Books Of India Blog

Data Processing and Modeling With Hadoop

Mastering Hadoop Ecosystem Including ETL, Data Vault, DMBok, GDPR, and Various Data-Centric Tools
Vinicius Aquino do Vale
Understand data in a simple way using a data lake.
● In-depth practical demonstration of Hadoop/YARN concepts with numerous examples.
● Includes graphical illustrations and visual explanations for Hadoop commands and parameters.
● Includes details of dimensional modeling and Data Vault modeling.
● Includes details of how to create and define a structure for a data lake.
The book ‘Data Processing and Modeling with Hadoop’ explains how a distributed system works and its benefits in the big data era in a straightforward and clear manner. After reading the book, you will be able to plan and organize projects involving a massive amount of data.
The book describes the standards and technologies that aid in data management and compares them to other technology business standards. The reader receives practical guidance on how to separate data into zones, as well as how to develop a model that can support data evolution. It discusses security and the measures used to reduce the impact of security incidents. Self-service analytics, Data Lake, Data Vault 2.0, and Data Mesh are also discussed in the book.
After reading this book, the reader will have a thorough understanding of how to structure a data lake, as well as the ability to plan, organize, and carry out the implementation of a data-driven business with full governance and security.
● Learn the basics of the components of the Hadoop Ecosystem.
● Understand the structure, files, and zones of a Data Lake.
● Learn to implement the security part of the Hadoop Ecosystem.
● Learn to work with the Data Vault 2.0 modeling.
● Learn to develop a strategy to define good governance.
● Learn new tools to work with Data and Big Data.
This book caters to big data developers, technical specialists, consultants, and students who want to build good proficiency in big data. Basic knowledge of SQL, modeling, and development is helpful but not mandatory.
1. Understanding the Current Moment
2. Defining the Zones
3. The Importance of Modeling
4. Massive Parallel Processing
5. Doing ETL/ELT
6. A Little Governance
7. Talking About Security
8. What Are the Next Steps?


Hadoop ecosystem, Data Vault 2.0, Data Lake, Zones (Raw, Trusted, Refined), Data Mesh, Data Driven, DMBok, Java, Kerberos, Data Quality, Machine Learning, Modeling, Self Service Analytics, Visualization Tools, Apache Foundation
ISBN: 9789391392284
eISBN: 9789391392369
BISAC codes:
COM096000    COMPUTERS / Parallel Processing
COM048000    COMPUTERS / Distributed Systems / General
COM062000    COMPUTERS / Data Science / Data Modeling & Design
COM032000    COMPUTERS / Information Technology
COM005030    COMPUTERS / Business & Productivity Software / Business Intelligence
COM091000    COMPUTERS / Distributed Systems / Cloud Computing
COM051230    COMPUTERS / Software Development & Engineering / General
COM063000    COMPUTERS / Document Management
COM046070    COMPUTERS / Operating Systems / Linux
Category: Big Data & Databases
Concepts: Data Mining & Warehousing, Business Analytics, Database Design & Programming
INR: 699
USD: 29.95
Ebook (INR): 560
Pages: 198
Size: 6 x 9 inches
Release Date: 15-Nov-2021
Binding: Paperback
Vinicius Aquino do Vale is an experienced technical consultant who has worked with clients and partners for 15 years on the design of technological solutions. In his career, Vinicius has participated in large projects as a specialist in Big Data technologies, with advanced knowledge of the Hadoop ecosystem. He has worked on several Big Data projects at the largest companies in Brazil, assisting with architecture design, implementation, configuration, ingestion, analysis, and ETL. He participated in the construction of all data lake / smart data flows and integrated the entire system with analytics tools such as Qlik Sense, QlikView, Tableau, Metabase, and TIBCO Spotfire, and also implemented security integration with AD / LDAP.
Vinicius served as an MBA professor, teaching NoSQL, Data Ingestion, and Parallel Mass Processing classes, and has spoken at IT events for IT companies and communities. He is a PostgreSQL, MongoDB, and Cassandra database specialist, as well as a Linux server specialist (CentOS, Debian, RedHat, SUSE). Vinicius has dedicated years to his professional development, obtaining international certifications such as ITIL, LPIC (Linux), OCJP (Oracle Certified Professional, Java SE 6 Programmer), OCE-WCD (Oracle Certified Expert, Java EE 6 Web Component Developer), OCE-JPAD (Oracle Certified Expert, Java EE 6 Java Persistence API Developer), OCE-EJB (Oracle Certified Expert, Java EE 6 Enterprise JavaBeans Developer), Hadoop Administrator (Cloudera), and PostgreSQL (EnterpriseDB).
He has extensive experience in Java development for the Web, working with several technologies and frameworks, and is able to lead and coordinate projects using agile methodologies such as Scrum and Kanban. In addition, Vinicius has command of public clouds such as Google Cloud (GCP), Azure, and AWS.
In 2014, Vinicius founded his own education company, Sudoers, where he is the Founder and Professor of Technology, helping, training and mentoring young people to pursue careers in technology.



BPB Publications
Asia’s Largest Publisher of IT Books
20 Ansari Road, Darya Ganj, New Delhi-110002, India
Tel: (011)-23254990/ 91, 23267741, 43508428
Mob: 8459388882, 9313292760
e: sales@bpbonline.com