Real-Time Semantic Indexing for High-Volume Data Streams

Authors

  • Yeshwanth Raj
  • Hassan Mohamed Mahdi
  • Dr. Benjamin Jones Abraham
  • Dr. S. Rama Sree
  • R. Kiruthika
  • Khusainov Ilyos Jamoliddin Ugli

DOI:

https://doi.org/10.51983/ijiss-2025.IJISS.15.3.47

Keywords:

Real-Time Data Handling, High-Rate Data Stream Processing, Semantic Indexing, Natural Language Understanding, Knowledge Graphs, Highly Scalable Systems

Abstract

Rapidly accumulating high-volume data from sources such as social media, IoT devices, and financial markets poses substantial challenges for real-time processing, storage, and retrieval. Traditional indexing and search approaches cannot keep pace with the velocity, volume, and variety of these streams while still returning conceptually relevant results. This paper proposes a model for real-time semantic indexing (RTSI) that aims to improve information retrieval and analytics by incorporating semantics into the indexing process during data ingestion. Contextual meaning is assigned to data items in real time using lightweight natural language processing (NLP), entity recognition, topic modeling, and knowledge embedding. The distributed architecture, built on scalable stream processing engines such as Apache Flink or Kafka Streams, provides low-latency performance for practical deployments. We applied the proposed system to multiple high-throughput datasets consisting of news feeds, social media posts, and sensor logs. Experimental results show that RTSI outperforms conventional keyword-based indexing on search and analytic tasks in terms of real-time relevance and accuracy. The semantic layer additionally enables context-aware alerting, anomaly detection, and trend monitoring. The system is also adaptive, supporting continuous refinement of semantic representations as new data arrives. The results suggest that incorporating semantic techniques into real-time stream indexing can improve the responsiveness, intelligence, and scalability of data-driven applications.
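
The idea of attaching semantics at ingestion time, rather than at query time, can be illustrated with a minimal sketch. This is not the authors' implementation: the `SemanticIndex` class, the `GAZETTEER` lookup table, the concept identifiers, and the sample records are all hypothetical, and a simple dictionary lookup stands in for the entity recognition and knowledge embedding steps named in the abstract.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer mapping surface forms to concept IDs. This is an
# assumption for illustration; a real pipeline would use an entity recognizer
# and a knowledge graph instead of a hand-written dictionary.
GAZETTEER = {
    "nyse": "concept:stock_exchange",
    "nasdaq": "concept:stock_exchange",
    "thermostat": "concept:iot_sensor",
    "temperature sensor": "concept:iot_sensor",
}

TOKEN = re.compile(r"[a-z0-9]+")


class SemanticIndex:
    """Toy inverted index that stores keyword and concept postings side by side."""

    def __init__(self):
        self.keyword_postings = defaultdict(set)
        self.concept_postings = defaultdict(set)

    def ingest(self, doc_id, text):
        """Index one record as it arrives: plain tokens plus matched concepts."""
        lowered = text.lower()
        for token in TOKEN.findall(lowered):
            self.keyword_postings[token].add(doc_id)
        for surface, concept in GAZETTEER.items():
            # Whole-word match so that short surface forms do not hit substrings.
            if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
                self.concept_postings[concept].add(doc_id)

    def search_keyword(self, term):
        return sorted(self.keyword_postings.get(term.lower(), set()))

    def search_concept(self, concept):
        return sorted(self.concept_postings.get(concept, set()))


def build_demo_index():
    """Feed a small, made-up stream of records through the index."""
    index = SemanticIndex()
    stream = [
        (1, "NYSE opens higher after rate decision"),
        (2, "Nasdaq slips as chip stocks fall"),
        (3, "Thermostat reports 41C in server room"),
    ]
    for doc_id, text in stream:
        index.ingest(doc_id, text)
    return index
```

In this sketch a keyword query for "nyse" matches only the record containing that literal token, while a query on the hypothetical concept `concept:stock_exchange` also retrieves the Nasdaq record, because both surface forms were mapped to the same concept when the records were ingested. In a distributed deployment the `ingest` step would presumably run inside a stream processing operator rather than a local loop.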

Published

30-09-2025

How to Cite

Raj, Y., Mahdi, H. M., Abraham, B. J., Rama Sree, S., Kiruthika, R., & Ugli, K. I. J. (2025). Real-Time Semantic Indexing for High-Volume Data Streams. Indian Journal of Information Sources and Services, 15(3), 423–431. https://doi.org/10.51983/ijiss-2025.IJISS.15.3.47