Category: database

Architecture, Big Data, Certification, database, Development

70-777 Implementing Microsoft Azure Cosmos DB Solutions Certification Exam

The 70-777 Implementing Microsoft Azure Cosmos DB Solutions certification exam tests and validates your expertise in designing, building, and troubleshooting Azure Cosmos DB solutions. The exam focuses on the Azure Cosmos DB database service in the Microsoft Azure cloud and is targeted at database developers, big data developers, and architects who leverage Azure Cosmos DB in their solutions. Read More

Certification, Data Science, database

70-776: Perform Big Data Engineering on Microsoft Cloud Services Certification Exam (20776)

The 70-776 Perform Big Data Engineering on Microsoft Cloud Services certification exam tests and validates your expertise in designing analytics solutions and building operationalized solutions on Microsoft Azure. This exam covers data engineering topics across Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory, and Azure Stream Analytics.

Exam Target Audience

The 70-776 Perform Big Data Engineering on Microsoft Cloud Services certification exam is targeted at big data professionals. It centers on designing analytics solutions and building operationalized solutions on Microsoft Azure. The primary Azure services covered on this exam are Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory, and Azure Stream Analytics.

Candidates who are experienced and familiar with the capabilities and features of batch data processing, real-time processing, and operationalization technologies are the target audience for this exam. These candidates will be able to apply Microsoft cloud technologies to solution designs and implement big data analytics solutions.

Skills Measured

Here is a list of the skills and objectives measured on this exam. The percentage shown for each high-level objective area represents the portion of the exam focused on that area.

  • Design and Implement Complex Event Processing By Using Azure Stream Analytics (15-20%)
    • Ingest data for real-time processing
      • Select appropriate data ingestion technology based on specific constraints; design partitioning scheme and select mechanism for partitioning; ingest and process data from a Twitter stream; connect to stream processing entities; estimate throughput, latency needs, and job footprint; design reference data streams
    • Design and implement Azure Stream Analytics
      • Configure thresholds, use the Azure Machine Learning UDF, create alerts based on conditions, use a machine learning model for scoring, train a model for continuous learning, use common stream processing scenarios
    • Implement and manage the streaming pipeline
      • Stream data to a live dashboard, archive data as a storage artifact for batch processing, enable consistency between stream processing and batch processing logic
    • Query real-time data by using the Azure Stream Analytics query language
      • Use built-in functions, use data types, identify query language elements, control query windowing by using Time Management, guarantee event delivery (a small windowing sketch follows this objectives list)
  • Design and Implement Analytics by Using Azure Data Lake (25-30%)
    • Ingest data into Azure Data Lake Store
      • Create an Azure Data Lake Store (ADLS) account, copy data to ADLS, secure data within ADLS by using access control, leverage end-user or service-to-service authentication appropriately, tune the performance of ADLS, access diagnostic logs
    • Manage Azure Data Lake Analytics
      • Create an Azure Data Lake Analytics (ADLA) account, manage users, manage data sources, manage, monitor, and troubleshoot jobs, access diagnostic logs, optimize jobs by using the vertex view, identify historical job information
    • Extract and transform data by using U-SQL
      • Schematize data on read at scale; generate outputter files; use the U-SQL data types, use C# and U-SQL expression language; identify major differences between T-SQL and U-SQL; perform JOINS, PIVOT, UNPIVOT, CROSS APPLY, and Windowing functions in U-SQL; share data and code through U-SQL catalog; define benefits and use of structured data in U-SQL; manage and secure the Catalog
    • Extend U-SQL programmability
      • Use user-defined functions, aggregators, and operators, scale out user-defined operators, call Python, R, and Cognitive capabilities, use U-SQL user-defined types, perform federated queries, share data and code across ADLA and ADLS
    • Integrate Azure Data Lake Analytics with other services
      • Integrate with Azure Data Factory, Azure HDInsight, Azure Data Catalog, and Azure Event Hubs, ingest data from Azure SQL Data Warehouse
  • Design and Implement Azure SQL Data Warehouse Solutions (15-20%)
    • Design tables in Azure SQL Data Warehouse
      • Choose the optimal type of distribution column to optimize workflows, select a table geometry, limit data skew and process skew through the appropriate selection of distributed columns, design columnstore indexes, identify when to scale compute nodes, calculate the number of distributions for a given workload (a distribution-skew sketch follows the objectives list)
    • Query data in Azure SQL Data Warehouse
      • Implement query labels, aggregate functions, create and manage statistics in distributed tables, monitor user queries to identify performance issues, change a user resource class
    • Integrate Azure SQL Data Warehouse with other services
      • Ingest data into Azure SQL Data Warehouse by using AZCopy, Polybase, Bulk Copy Program (BCP), Azure Data Factory, SQL Server Integration Services (SSIS), Create-Table-As-Select (CTAS), and Create-External-Table-As-Select (CETAS); export data from Azure SQL Data Warehouse; provide connection information to access Azure SQL Data Warehouse from Azure Machine Learning; leverage Polybase to access a different distributed store; migrate data to Azure SQL Data Warehouse; select the appropriate ingestion method based on business needs
  • Design and Implement Cloud-Based Integration by using Azure Data Factory (15-20%)
    • Implement datasets and linked services
      • Implement availability for the slice, create dataset policies, configure the appropriate linked service based on the activity and the dataset
    • Move, transform, and analyze data by using Azure Data Factory activities
      • Copy data between on-premises and the cloud, create different activity types, extend the data factory by using custom processing steps, move data to and from Azure SQL Data Warehouse
    • Orchestrate data processing by using Azure Data Factory pipelines
      • Identify data dependencies and chain multiple activities, model schedules based on data dependencies, provision and run data pipelines, design a data flow
    • Monitor and manage Azure Data Factory
      • Identify failures and root causes, create alerts for specified conditions, perform a redeploy, use the Microsoft Azure Portal monitoring tool
  • Manage and Maintain Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory, and Azure Stream Analytics (20-25%)
    • Provision Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory, and Azure Stream Analytics
      • Provision Azure SQL Data Warehouse, Azure Data Lake, and Azure Data Factory, implement Azure Stream Analytics
    • Implement authentication, authorization, and auditing
      • Integrate services with Azure Active Directory (Azure AD), use the local security model in Azure SQL Data Warehouse, configure firewalls, implement auditing, integrate services with Azure Data Factory
    • Manage data recovery for Azure SQL Data Warehouse, Azure Data Lake, Azure Data Factory, and Azure Stream Analytics
      • Backup and recover services, plan and implement geo-redundancy for Azure Storage, migrate from an on-premises data warehouse to Azure SQL Data Warehouse
    • Monitor Azure SQL Data Warehouse, Azure Data Lake, and Azure Stream Analytics
      • Manage concurrency, manage elastic scale for Azure SQL Data Warehouse, monitor workloads by using Dynamic Management Views (DMVs) for Azure SQL Data Warehouse, troubleshoot Azure Data Lake performance by using the Vertex Execution View
    • Design and implement storage solutions for big data implementations
      • Optimize storage to meet performance needs, select appropriate storage types based on business requirements, use AZCopy, Storage Explorer and Redgate Azure Explorer to migrate data, design cloud solutions that integrate with on-premises data
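
To make the "Query real-time data by using the Azure Stream Analytics query language" objective more concrete, here is a minimal, self-contained Python sketch of tumbling-window semantics, the same behavior the TumblingWindow function provides in the Stream Analytics query language. The sample events and the 30-second window size are hypothetical, purely for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 30  # hypothetical window size, like TumblingWindow(second, 30)

# Hypothetical (timestamp_in_seconds, sensor_id, reading) events.
events = [
    (3, "sensor-a", 20.5),
    (17, "sensor-a", 21.0),
    (31, "sensor-b", 19.2),
    (44, "sensor-a", 22.3),
    (59, "sensor-b", 18.7),
]

# Tumbling windows are fixed-size and non-overlapping, so every event
# falls into exactly one window: [0, 30), [30, 60), and so on.
windows = defaultdict(list)
for timestamp, sensor, reading in events:
    window_start = (timestamp // WINDOW_SECONDS) * WINDOW_SECONDS
    windows[window_start].append(reading)

# Emit one aggregate per window, analogous to
# SELECT AVG(reading) FROM input GROUP BY TumblingWindow(second, 30).
for start in sorted(windows):
    readings = windows[start]
    avg = sum(readings) / len(readings)
    print(f"window [{start}, {start + WINDOW_SECONDS}): avg reading = {avg:.2f}")
```

In an actual Stream Analytics job you would express this declaratively with a GROUP BY TumblingWindow(second, 30) clause rather than in code.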

You can also view the full objectives list for the 70-776 Perform Big Data Engineering on Microsoft Cloud Services certification exam on the official 70-776 exam page at Microsoft.com.
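
Similarly, for the "Design tables in Azure SQL Data Warehouse" objective, the following Python sketch illustrates why the choice of hash distribution column matters. Azure SQL Data Warehouse spreads every table across 60 distributions; hashing on a low-cardinality column concentrates rows into just a few of them (data skew). Python's built-in hash stands in here for the engine's internal hash function, and the sample columns are made up purely for illustration.

```python
from collections import Counter

DISTRIBUTION_COUNT = 60  # Azure SQL Data Warehouse always uses 60 distributions

def skew_report(name, values):
    """Hash each value to a distribution and report how uneven the spread is."""
    counts = Counter(hash(v) % DISTRIBUTION_COUNT for v in values)
    busiest = max(counts.values())
    ideal = len(values) / DISTRIBUTION_COUNT
    print(f"{name}: {len(counts)} of {DISTRIBUTION_COUNT} distributions used, "
          f"busiest holds {busiest} rows (ideal is about {ideal:.0f})")

rows = 60_000
# High-cardinality column (e.g., an order key): rows spread evenly.
skew_report("order_key", range(rows))
# Low-cardinality column (e.g., one of 4 region codes): severe data skew.
skew_report("region", [i % 4 for i in range(rows)])
```

A skewed distribution column means a few distributions do most of the work while the rest sit idle, which is exactly the "limit data skew" concern called out in the objective.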

Training Material

There are not very many study or training materials designed specifically for the 70-776 Perform Big Data Engineering on Microsoft Cloud Services certification exam. As an example, Microsoft Press has not published any Exam Ref or Guide books for this exam yet (at the time of this writing). However, there is still plenty of material available from various sources, both free and paid, ranging from official Microsoft product documentation and other Microsoft articles and videos to training from companies like Opsgility and their SkillMeUp service.

Another interesting resource to utilize is the following recording of the 70-776 Cert Exam Prep session given by James Herring at Microsoft Ignite 2017:

Happy Studying!!

Architecture, database

Designing Globally Resilient Apps with Azure App Service and Cosmos DB

It’s so quick and easy to deploy an application to Microsoft Azure and make it available for anyone in the world to use. It’s even quicker if you utilize Platform as a Service (PaaS) offerings like Azure App Service (Web Apps, API Apps, Logic Apps, etc.) along with Azure SQL Database and Azure Cosmos DB. However, it can be a bit trickier to make that application resilient to failure, specifically regional failure. How do you design an application to be truly globally resilient? What if a specific data center or region goes down? Will your application stay up and keep your users productive?

You can add high availability by increasing the number of instances, but that only applies to a single region. You could implement failover, but does that offer the best experience for your users? This article goes through many of the tips and techniques that can be used within Microsoft Azure to build truly globally resilient applications. Read More

Architecture, Big Data, Blockchain, database

Blockchain is not just for Cryptocurrency, but for Enterprises and the Future

The term “Blockchain” has been gaining a lot of buzz, especially recently. It all started with Bitcoin and cryptocurrencies, but blockchain has taken off as a pretty remarkable advancement in secure communication and database technology. Blockchain is not just for cryptocurrencies, and many enterprises are seeing how it can be applied to the business world. With its link to cryptocurrency, it seems pretty obvious that it could be used in the financial industry, but that’s really only the tip of the iceberg (so to speak). There are MANY other potential uses for blockchain within various IT solutions that transcend what you might think at first glance. This article describes what blockchain is, where it came from, and how it can be utilized in some really innovative ways. Read More

Certification, database

70-473 Cloud Data Platform Solutions Exam – June 2017 Update

The 70-473 Designing and Implementing Cloud Data Platform Solutions certification exam was first published on October 27, 2015. A lot has changed in Microsoft Azure since then. For this reason, Microsoft periodically publishes updates to its certification exams, and in June 2017 Microsoft published an update that brings this exam in line with the current state of the Azure data platform. This article outlines the current state of this certification exam. Read More

database, portal

CosmosDB: The New DocumentDB NoSQL Database in Microsoft Azure

DocumentDB has been around for a while now in Microsoft Azure. It’s a document-based NoSQL database in the cloud. There have been tons of advancements to the service over time, including MongoDB API support so you can use it in place of MongoDB for existing code bases. It has been called “DocumentDB” since the initial release of the service, but for a time it was labeled as “NoSQL (DocumentDB)” in the Azure Portal. That seemed like some indication that Microsoft wasn’t happy with the name it first chose.

Today, we wake up in the morning to updates to the Azure Portal where DocumentDB is no longer there. Well, it actually is there, but it has undergone a renaming and rebranding. From this day forward, DocumentDB will no longer be called DocumentDB. Instead, we will call this NoSQL, document-based database service… Azure Cosmos DB.

This naming change to Cosmos DB isn’t the only thing released. There’s also the all-new “Data Explorer” UI in the Azure Portal, which makes it a bit easier to use DocumentDB… ahem… Cosmos DB!

In the sessions and documentation coming out of Microsoft and the Microsoft Build 2017 conference, which starts today, I’m sure we’ll hear all about these changes to our favorite NoSQL document store in the Microsoft Azure cloud.

For now, here are some links to additional artifacts I’ve found that show Microsoft is in fact renaming DocumentDB to Cosmos DB:

Here’s the description in the Azure Portal when you search for “CosmosDB” in the Azure Marketplace:

Azure Cosmos DB is a fully managed, globally-distributed, horizontally scalable in storage and throughput, multi-model database service backed up by comprehensive SLAs. Azure Cosmos DB is the next generation of Azure DocumentDB. Cosmos DB was built from the ground up with global distribution and horizontal scale at its core – it offers turn-key global distribution across any number of Azure regions by transparently scaling and replicating your data wherever your users are. You can elastically scale throughput and storage worldwide and pay only for the throughput and storage you need. Cosmos DB guarantees single-digit millisecond latencies at the 99th percentile anywhere in the world, offers multiple well-defined consistency models to fine-tune for performance and guaranteed high availability with multi-homing capabilities – all backed by industry leading service level agreements (SLAs).

Cosmos DB is truly schema-agnostic – it automatically indexes all the data without requiring you to deal with schema and index management. Cosmos DB is multi-model – it natively supports document, key-value, graph and columnar data models. With Cosmos DB, you can access your data using NoSQL APIs of your choice — DocumentDB SQL (document), MongoDB (document), Azure Table Storage (key-value), and Gremlin (graph), are all natively supported. Cosmos DB is a fully managed, enterprise ready and trustworthy service. All your data is fully and transparently encrypted and secure by default. Cosmos DB is ISO, FedRAMP, EU, HIPAA, and PCI compliant as well.
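
To give a quick taste of the DocumentDB SQL API mentioned above, here is a minimal sketch using the azure-cosmos Python SDK (a newer SDK than the pydocumentdb library that existed when this post was written). The endpoint, key, and database and container names are placeholders you would replace with your own account’s values.

```python
from azure.cosmos import CosmosClient, PartitionKey

# Placeholder values; substitute your Cosmos DB account's endpoint and key.
ENDPOINT = "https://<your-account>.documents.azure.com:443/"
KEY = "<your-primary-key>"

client = CosmosClient(ENDPOINT, credential=KEY)

# Create (or reuse) a database and a container partitioned on /category.
database = client.create_database_if_not_exists("demo-db")
container = database.create_container_if_not_exists(
    id="items",
    partition_key=PartitionKey(path="/category"),
)

# Insert a schema-free JSON document; no index management required.
container.upsert_item({"id": "1", "category": "books", "title": "Cosmos DB Intro"})

# Query with the SQL (DocumentDB) API.
for item in container.query_items(
    query="SELECT c.id, c.title FROM c WHERE c.category = 'books'",
    enable_cross_partition_query=True,
):
    print(item)
```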

Happy discovering new features!!

Certification, database

MCSA: Data Engineering with Azure Certification from Microsoft

There have been many changes to the Microsoft certification program lately, including brand new MCSA and MCSE certifications and MANY new exams. Among these changes is the recent addition of the MCSA: Big Data Engineering certification.

Update May 27, 2017: Microsoft announced that they’ve renamed the MCSA: Big Data Engineering certification to MCSA: Data Engineering with Azure.

Read More

database, PaaS, Video

Create Azure SQL Database in the Azure Portal

Here’s a short video that shows you how to create an Azure SQL Database in the Azure Portal. It also explains how to connect to the database and how the relationship between Azure SQL Databases and Azure SQL Servers works. Additionally, it gives an overview of a few other features, such as Geo-Replication and Transparent Data Encryption.
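
If you’d like to follow along with the connection portion of the video from code, here is a minimal sketch of connecting to an Azure SQL Database from Python with pyodbc. The server, database, and credential values are placeholders; copy the real values from the connection strings blade of your database in the Azure Portal, and note you also need the Microsoft ODBC Driver for SQL Server installed locally.

```python
import pyodbc

# Placeholder connection details; copy the real values from the
# connection strings blade of your Azure SQL Database in the portal.
conn_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<your-server>.database.windows.net,1433;"
    "DATABASE=<your-database>;"
    "UID=<your-username>;"
    "PWD=<your-password>;"
    "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT @@VERSION;")
    print(cursor.fetchone()[0])
```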

Please subscribe to get more videos like this, all around Microsoft Azure.

This is one of the first videos I’ve published to the Build Azure YouTube Channel where I’m starting to build out video content to accompany this site. Enjoy!