What is the architecture of Azure Data Lake?

Azure Data Lake is designed around two major components: Data Lake Store and Data Lake Analytics. Its main building blocks are:

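1.) Internal system - YARN and WebHDFS. YARN manages resources and schedules jobs for analytics, and WebHDFS provides Hadoop HDFS-compatible access to the storage.
2.) Analytics - U-SQL, the query language of Azure Data Lake Analytics (ADLA).
3.) Compute engine - HDInsight (Big Data batch processing).
4.) Storage - Azure Data Lake Store (ADLS), serving as the hyper-scale storage layer.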
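
To make the WebHDFS piece concrete, here is a minimal Python sketch of how a WebHDFS-style REST URL for a Data Lake Store account could be put together. It assumes the host pattern `<account>.azuredatalakestore.net` used by the first-generation Data Lake Store; the account name `mydatalake`, the folder `raw/weblogs`, and the helper `webhdfs_url` are hypothetical names for illustration only. It only builds the URL. A real request also needs an Azure Active Directory bearer token, which is left out here.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(account: str, path: str, op: str) -> str:
    """Build a WebHDFS-style REST URL for a Data Lake Store account.

    Hypothetical helper: it only formats the URL and sends nothing.
    """
    # Assumed host pattern for a first-generation Data Lake Store account.
    host = f"{account}.azuredatalakestore.net"
    # Normalise the path so it starts with exactly one slash.
    clean_path = "/" + path.strip("/")
    # WebHDFS selects the operation through the "op" query parameter.
    query = urlencode({"op": op})
    return f"https://{host}/webhdfs/v1{quote(clean_path)}?{query}"


if __name__ == "__main__":
    # "mydatalake" and "raw/weblogs" are made-up example values.
    print(webhdfs_url("mydatalake", "raw/weblogs", "LISTSTATUS"))
```

Because the store speaks the same REST dialect as Hadoop HDFS, tools that already understand WebHDFS operations such as `LISTSTATUS` or `OPEN` can work against it with little change.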

What can I do with Azure Data Lake Analytics?
Right now, ADLA is focused on batch processing, which suits many Big Data workloads. Typical uses include:
· Prepping large amounts of data for insertion into a Data Warehouse
· Processing scraped web data for science and analysis
· Churning through text and quickly tokenizing it to enable context and sentiment analysis
· Using image processing intelligence to quickly process unstructured image data
· Replacing long-running monthly batch processing with shorter-running distributed processes
ADLA is well equipped to handle many of the types of processing we do in the T portion of ETL; that is, transforming data. If you've found that your data volumes have increased, changed shape, or you are generally not happy with your ETL performance, ADLA might serve as a good replacement for your traditional approach to prepping data for analysis.
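
As a rough illustration of the kind of "T" step described above, the following Python sketch tokenizes rows of text and counts the tokens, which is the sort of preparation that feeds sentiment analysis. It is plain local Python, not U-SQL and not ADLA itself; the function names `tokenize` and `token_counts` and the sample rows are assumptions made up for this example. In ADLA the same logic would be expressed as a distributed batch job over files in the store.

```python
import re
from collections import Counter
from typing import Iterable, List

# Lowercase words, digits and apostrophes count as tokens in this sketch.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> List[str]:
    """Split one row of text into lowercase tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def token_counts(rows: Iterable[str]) -> Counter:
    """Transform step: turn raw text rows into per-token counts."""
    counts: Counter = Counter()
    for row in rows:
        counts.update(tokenize(row))
    return counts


if __name__ == "__main__":
    # Made-up sample rows standing in for scraped review text.
    sample = ["Great product, great price!", "The price was not great."]
    for token, count in token_counts(sample).most_common(3):
        print(token, count)
```

Each row is processed independently and the counts are merged afterwards, which is what makes this kind of work a natural fit for a distributed batch engine.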

Thanks for reading

