Predictive Analysis on Sensor Data Using Distributed Machine Learning
Keywords: Hadoop, Distributed Systems, Machine Learning, Sensors
Machine learning is the science of getting computers to act without being explicitly programmed. It resembles data mining in how it works: both sift through data to find patterns. However, whereas data mining extracts patterns for human comprehension, machine learning uses them to improve the program's own understanding. Machine learning programs detect patterns in data and adjust their behaviour accordingly. This research is based on real-world data obtained from sensors that monitor drilling procedures and equipment at an oil and gas corporation. The sensor data is streamed at one-second intervals, resulting in 86,400 rows of data per day. After a review of state-of-the-art Big Data analytics tools such as Mahout, RHadoop, and Spark, 0xdata's H2O was chosen for this problem because of its rapid in-memory processing, robust machine learning engine, and ease of use. With accurate predictive analytics on massive sensor data, missing values can be estimated and wrong readings caused by malfunctioning sensors or a broken communication line can be replaced. Such analytics can also predict future conditions to support decision-making processes such as maintenance planning and operation. In this project, the sensor data was analysed and outputs were predicted using the H2O tool. The machine learning techniques were run on a distributed system of five connected nodes to achieve parallel processing.
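As an illustration of the missing-value estimation described above, the following minimal sketch fits a simple model on the readings that are present and uses it to fill the gaps. It does not use H2O or the study's data: ordinary least squares on a single input stands in for the machine learning engine, and the sensor names (`speed`, `torque`), the linear relationship between them, the noise level, and the 1% missing rate are all hypothetical choices made only for the example.

```python
import random

ROWS_PER_DAY = 24 * 60 * 60  # one reading per second gives 86400 rows


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_missing(inputs, targets):
    """Replace None entries in targets with predictions made from inputs."""
    known = [(x, y) for x, y in zip(inputs, targets) if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [slope * x + intercept if y is None else y
            for x, y in zip(inputs, targets)]


def simulate_day(seed=0, missing_rate=0.01):
    """Hypothetical data: torque depends linearly on speed, plus noise."""
    rng = random.Random(seed)
    speed = [rng.uniform(50.0, 150.0) for _ in range(ROWS_PER_DAY)]
    torque = [2.0 * s + 10.0 + rng.gauss(0.0, 1.0) for s in speed]
    # Drop a fraction of readings to mimic sensor or link failures.
    observed = [None if rng.random() < missing_rate else t for t in torque]
    return speed, torque, observed


speed, torque, observed = simulate_day()
filled = impute_missing(speed, observed)

# Compare the estimates with the true values at the positions that were lost.
gaps = [i for i, y in enumerate(observed) if y is None]
mean_abs_error = sum(abs(filled[i] - torque[i]) for i in gaps) / len(gaps)
print(len(filled), len(gaps), round(mean_abs_error, 2))
```

In a distributed setting the same fit-then-predict pattern would apply, with the model trained across the cluster nodes rather than in a single process.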