Wednesday, June 17, 2020
Anomaly detection by Aashrith Sai
To me, an anomaly is anything that deviates from what is expected, normal, or standard. In every organization or business, everything is planned to meet the expectations of clients or consumers, and in all these scenarios one must be aware of anomalies to execute those plans successfully. The more we know about the probability of an event happening in the future, the better prepared we can be in the present to handle it.
Real-World Scenario: Let us look at the example of a grocery store.
Fig 1: Every weekend there are a lot of customers.
Fig 2: The weekend when there was a thunderstorm.
Because the number of customers is very high on weekends, the grocery store owner hires more staff for those days and purchases more goods from vendors. The owner would have saved a lot of money if he had known in advance that a thunderstorm would hurt his sales. In this case, the thunderstorm is the anomaly. A lot of resources can be saved, and plans executed more efficiently, if we have information about anomalies in the past, the present, and the future.
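As a minimal sketch of this idea (all customer counts and thresholds below are made up for illustration, not real store data), a simple per-weekday z-score can flag a weekend whose turnout deviates sharply from historical weekends:

```python
import numpy as np

# Hypothetical daily customer counts: weekdays average ~100 customers,
# weekend days ~250, over eight past weeks (illustrative numbers only).
rng = np.random.default_rng(seed=42)
history = np.array([100, 105, 98, 110, 102, 250, 260] * 8, dtype=float)
history += rng.normal(0, 10, size=history.size)
weeks = history.reshape(-1, 7)  # one row per week, one column per weekday

# "Fit": learn the normal mean and spread for each day of the week.
mean, std = weeks.mean(axis=0), weeks.std(axis=0)

# "Score": a stormy weekend with far fewer customers than usual.
this_week = np.array([103, 97, 108, 101, 99, 80, 75], dtype=float)
z = (this_week - mean) / std
for day, score in enumerate(z):
    if abs(score) > 3:  # 3-sigma rule of thumb for flagging outliers
        print(f"Day {day}: {this_week[day]:.0f} customers, z = {score:.1f} -> anomaly")
```

A real system would use more robust statistics and richer features, but the principle is the same: model what is normal, then flag whatever deviates from it.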
Internship experience at Alexis:
Alexis Networks is the first company I have worked with in the US. Over the last few weeks I have been given the opportunity to explore many new technologies in machine learning, the cloud, and other areas that are essential for product development. Through this opportunity, I have discovered a new interest in AWS, which is driving me to work toward certification as an AWS machine learning developer. I am thoroughly enjoying the freedom I have been given to explore new paths and have a complete learning experience.
My excitement about new projects, products, and tasks has reaffirmed that I am headed in the right direction, exploring technology and building the skill set I need to be successful in my career.
What have you learned thus far compared to what you initially expected?
I have always wanted to work on products that impact people and make their lives easier and more comfortable. At Alexis, I am working on exactly that. I wanted hands-on experience with deep learning, and in the first few weeks of my internship I built a model to detect human anomalies. I have been working with and learning more about the AWS cloud for machine learning and deep learning applications, and I am learning how to package and deliver products to clients, from development through deployment. I am really enjoying this process and never expected that this opportunity would open up so many paths to learn and explore.
My cough and sneeze detection model:
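As a rough sketch of the general approach (the architecture, input shapes, and class names below are illustrative assumptions, not my actual model), a cough/sneeze detector can be built as a small convolutional network over log-mel spectrograms of short audio clips:

```python
from tensorflow.keras import layers, models

# Hypothetical setup: each clip becomes a 128 x 126 log-mel spectrogram
# with one channel; classes are cough, sneeze, and other.
NUM_CLASSES = 3

model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(128, 126, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_spectrograms, train_labels, validation_data=..., epochs=10)
```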
What has your experience been like during COVID-19 and how has it impacted you?
It is always good to have people around you to discuss new ideas and share experiences. I am glad that even though this situation is keeping us all at home, we can still connect with each other and work through blockers over social media and virtual meetings. I am really happy with how available everyone has been, and how much they have contributed to making my internship experience complete.
Wednesday, October 30, 2019
The Challenge
A modern technology-driven business is awash in mountains of sorted and unsorted data that can be challenging to transform into something that allows for deep analysis. From a development perspective, defining, building, and normalizing or cataloging data in a way that allows for easier analysis is a daunting task, especially as new data sets are identified as needing analysis.
Developers or database administrators have to create processes to extract the information from whichever repository it is stored in, transform it into a common set of data types or even fields, and ultimately load that information in an indexed format to a location that can then be accessed by business intelligence toolsets.
It is this Extract/Transform/Load (ETL) process that typically becomes the bottleneck in the overall analysis effort. Besides the need to create a new process each time a new dataset is identified, setting up routines to run these ETL processes at set times, or when certain conditions are met, can be a maintenance nightmare.
The Benefits of AWS Glue
With the introduction of AWS Glue by Amazon Web Services (AWS), this formerly painstaking task has been largely eliminated. By integrating closely with other key AWS services, such as S3, DynamoDB, and the RDS database engines, Glue allows an organization to simply point to the location where the raw data resides; Glue then takes care of extracting, transforming, and loading the data into a format that data analysis tools can access.
Moreover, AWS Glue provides a serverless environment in which clients pay only for the resources consumed while Glue is processing data, and it generates ETL code that is customizable, reusable, and portable, giving an organization's developers more freedom to sculpt processes that suit each particular need.
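To make the workflow concrete, here is a minimal sketch of what a Glue ETL script can look like (the database, table, column, and bucket names are placeholders for illustration, not details from either client engagement):

```python
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job boilerplate; the job name is passed in by the Glue service.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read a table that a Glue crawler has already cataloged.
source = glue_context.create_dynamic_frame.from_catalog(
    database="example_db", table_name="raw_customers")

# Transform: rename and retype columns into a common schema.
mapped = ApplyMapping.apply(frame=source, mappings=[
    ("cust_id", "string", "customer_id", "string"),
    ("amt", "double", "order_amount", "double"),
])

# Load: write the normalized data to S3 as Parquet for analysis tools.
glue_context.write_dynamic_frame.from_options(
    frame=mapped, connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/customers/"},
    format="parquet")

job.commit()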
How Alexis Networks Applied AWS Glue to Two Different Use Cases
Alexis Networks, an AWS Certified Premier Partner, was asked by two clients how they could best utilize AWS Glue to solve ongoing challenges within their organizations. Even though each client had very different concerns when attempting to aggregate their data sources, Alexis Networks was able to apply AWS Glue in a very similar manner to overcome both sets of challenges.
Client 1: How to Manage and Aggregate Widely Different Data Sources
The first client needed to compile a large amount of data from numerous sources into one location so that it could be analyzed. Their challenge was that they could not make accurate predictions from their customer data: the complexity of the data meant a significant amount of resources was needed to cleanse it, and they struggled to bring it together in a form that was easily understood.
With AWS Glue, Alexis Networks was able to define custom scripts that identified all of the various data sources the client relied on, brought them together, cleansed and transformed the data, enriched the datasets, and then partitioned them into Hive tables for later use. The complexity of the data sources no longer stood as a barrier for the client, ultimately leading to a clearer understanding of customer behavior and more accurate predictions.
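As an illustrative follow-on step (the paths, database, table, and partition column below are hypothetical), the cleansed output can be registered as a partitioned Hive table with Spark so that downstream tools can query it directly:

```python
from pyspark.sql import SparkSession

# Register curated Parquet data as a partitioned Hive table; assumes the
# "analytics" database already exists in the metastore / Glue Data Catalog.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.read.parquet("s3://example-bucket/curated/customers/")
(df.write
   .mode("overwrite")
   .partitionBy("region")          # one Hive partition per region value
   .format("parquet")
   .saveAsTable("analytics.customers"))
```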
Client 2: Extreme Load Latency Means Greater Costs and Inefficiencies
The second client had challenges ingesting all of the data they needed. With only two instances of their data available at any point in time, a bottleneck occurred within RDS each time they needed to consolidate the data into a centralized location. Hours would be spent aggregating the data, other reporting activities were impacted, and operational costs rose as a result. They were interested in speeding the process up and reducing overall costs.
Alexis Networks implemented AWS Glue for this client as well. Because of Glue's inherently distributed nature within the AWS ecosystem, it was able to take the client's large volume of data from disparate sources and load it within minutes instead of hours. And because AWS charges only for the time that Glue runs, under a low-cost pricing model, the client realized an immediate reduction in overall operational costs while streamlining the data ingestion process.
Architecture Diagram (For Client 1 and Client 2)
Alexis Networks Charts the Path
As with any AWS service, not every offering fits every client's need. Depending on the complexity and specific needs of the client, Alexis Networks can assess the situation and determine which approach and which services are in the client's best interest for long-term success.
With AWS Glue, the implementations Alexis Networks executed showed that both clients realized the immediate benefit of the solution. Delivering Glue meant very low ongoing operational costs, high data ingestion speeds thanks to the distributed nature of the AWS landscape, and consistent, reproducible cleansing and transformation steps built on customizable ETL code that remains reusable going forward. Leveraging Alexis Networks as a partner in your own company's journey means an outcome that will benefit your organization, your infrastructure, and your customers for years to come.