Year: 2016
Big Data Tools: Hadoop, MongoDB and Weka
Paula Catalina Jaraba Navas, Yesid Camilo Guacaneme Parra,
and José Ignacio Rodríguez Molano
http://link.springer.com/chapter/10.1007/978-3-319-40973-3_45
Abstract
Big Data is a term that describes the exponential growth of all sorts of data, structured and unstructured, from different sources (databases, social networks, the web, etc.), which, depending on how it is used, may become a benefit or an advantage for a company. This paper shows the current importance of Big Data, together with some of the algorithms that may be used to reveal patterns, trends and data associations that can generate valuable information in real time. It also describes characteristics and applications of some of the tools currently used for data analysis, to help establish which technology is most suitable to implement according to the needs or information required.
Keywords
Big data, Analysis tools, Hadoop, MongoDB, Weka
