International Journal of Research and Development in Engineering Sciences (IJRDES) | A Privacy-Preserving Universal Multimodal Framework for Real-Time Any-to-Any Transformation

A Privacy-Preserving Universal Multimodal Framework for Real-Time Any-to-Any Transformation

We propose a unified multimodal framework, Universal Multi-Modal Generation Enabling Any-to-Any Transformation, that enables seamless transformation between text, audio, and image inputs and outputs. The system integrates three core capabilities: speech understanding using Whisper, visual understanding through LLaVA, and speech synthesis via PyTorch-based text-to-speech models. All modules are deployed on-premise using Docker, providing a privacy-centric execution environment and reducing operational overhead associated with cloud processing. The framework supports advanced workflows including document/PDF-to-text extraction, text-to-speech conversion, and image-driven description generation, thereby enabling accessible and interactive multimodal content pipelines. The implementation emphasizes efficient orchestration and inference to meet real-time constraints. Experimental results across multiple cross-modal tasks demonstrate robust accuracy and consistently low latency, suggesting that local, containerized multimodal systems can deliver scalable performance for practical applications. The proposed approach is particularly relevant to accessibility, education, and content creation, where rapid modality conversion and data privacy are essential.
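The conversion paths named in the abstract can be pictured as a small routing graph in which text acts as the hub. The following is a minimal sketch under stated assumptions, not the authors' implementation: the converter functions are hypothetical stand-ins that only tag their input (they do not call Whisper, LLaVA, or any text-to-speech model), and the names `CONVERTERS`, `plan_route`, and `transform` are invented for illustration. Only the conversions mentioned in the abstract are registered, so a request such as text-to-image is reported as unsupported.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the real models. Each one only tags its input
# so the chain of conversions is visible in the result.
def transcribe_audio(data: str) -> str:      # placeholder for speech understanding
    return f"transcript({data})"

def describe_image(data: str) -> str:        # placeholder for visual understanding
    return f"description({data})"

def extract_document(data: str) -> str:      # placeholder for document/PDF extraction
    return f"extracted({data})"

def synthesize_speech(data: str) -> str:     # placeholder for text-to-speech
    return f"speech({data})"

# Registry of single-step conversions, keyed by (source, target) modality.
CONVERTERS: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("audio", "text"): transcribe_audio,
    ("image", "text"): describe_image,
    ("document", "text"): extract_document,
    ("text", "audio"): synthesize_speech,
}

def plan_route(source: str, target: str) -> List[Tuple[str, str]]:
    """Breadth-first search for the shortest chain of registered conversions."""
    if source == target:
        return []
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        modality, path = queue.popleft()
        for (src, dst) in CONVERTERS:
            if src != modality or dst in seen:
                continue
            new_path = path + [(src, dst)]
            if dst == target:
                return new_path
            seen.add(dst)
            queue.append((dst, new_path))
    raise ValueError(f"no conversion route from {source!r} to {target!r}")

def transform(data: str, source: str, target: str) -> str:
    """Apply each conversion on the planned route in order."""
    for step in plan_route(source, target):
        data = CONVERTERS[step](data)
    return data

if __name__ == "__main__":
    # Image-driven description followed by speech synthesis.
    print(transform("photo.png", "image", "audio"))
```

Keeping the registry separate from the routing logic means a new converter could be added as one dictionary entry, which is one plausible way to keep orchestration lightweight in a containerized deployment.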

Research Type: Applied Research
Paper Type: Analytical Research Paper
Vol. 8, Issue 2, Pages: 21-24, Mar 2026
Published on: 06 Mar, 2026
Issue Type: Regular

Cite Score: 100
No. of authors: 75
No. of Downloads: 43

About Authors:

Ramakrishna Kolikipogu

India

Chaitanya Bharathi Institute of Technology (CBIT)

Copyright © 2026. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC-BY-NC-SA). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Corresponding Author: Ramakrishna Kolikipogu, krkrishna.cse@gmail.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Conflict of interest: The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
