DEV Community

Awasume Marylin
Awasume Marylin

Posted on

Week 4 of my Data Analytics Journey

I am now at the ANALYSIS PHASE.
 Recall
➡️ ASK PHASE
➡️ PREPARE PHASE
➡️ PROCESS PHASE
⭕ ANALYSIS PHASE
➡️ SHARE PHASE
➡️ ACT PHASE
 This is the process used to make sense of the data collected. The goal of analysis is to identify trends and relationships with in data so you can accurately answer the questions you are asking.
 
 The four phases of analysis
 💠 Organize data
 💠 Format and adjust data
 💠 Get input from others
 💠 Transform data
This phase includes SORTING and FILTERING 
 - SORTING is the process of arranging data into a meaningful order to make 
 it easier to understand, analyze and visualize
 - FILTERING is used to show only the data that meets a specified criteria while 
 hiding the rest. It is useful when you have a lot of data.
 I also did practical work on SORTING and FILTERING data using MICROSOFT EXCEL and SQL
 CASE STUDY: MOVIE DATA, COUNTY NATALITY.
In SQL, i used the 'ORDER BY' and 'WHERE' clauses to sort and filter data.
i learnt FORMATTING FOR BETTER ANALYSIS.
 ➡️ Convert functions to change units of measurement ( in Excel)
 
Convert data in spreadsheet
➡️ Convert from STRING to DATE, Converting TEXT values containing both 
 DATES and TIME
➡️ Convert from STRING to NUMBER
➡️ Combining columns (merge text from 2 or more cells) using AMPERSAND
 SYMBOL and CONCATENATE
➡️ Convert from NUMBER to PERCENTAGE
 DATA VALIDATION IN EXCEL
 You can use data validation in excel to
 - Add dropdown lists with predertermined options
 - Create customs checkboxes
 - Protect structured data formulas
 CONDITIONAL FORMATTING
 A spreadsheet tool thst changes how cell appear when values meet specific
 conditions.
 Transforming data type in SQL using the CAST FUNCTION
 - Converting a NUMBER to a STRING
 -Converting a STRING to a NUMBER
 -Converting DATE to STRING
 -Convert DATE to DATETIME
 I also learnt how to combine 2 or more cells in SQL using the CONCATENATE FUNCTION.
 Using the CONCATENATE FUNCTION. In order combine data helps analyst to properly and efficiently analyze data, the data has to be clean and understandable. Data anaylyst use functions like CONCAT to make data easier to work with, which may require combining multiple cells.
i learnt about DATA AGGREGATION, VLOOKUP and JOIN 
DATA AGGREGATION is the process of gathering data from multiple sourrces in order to combine it in to a single summarized collection 
 - Puzzle pieces = data
 - Organization = aggregation
 - Pile of piece = Summary
 - Putting the pieces together = gaining insights
 So DATA AGGREGATION helps analyst to 
 ➡️ Identify trends 
 ➡️ Make comparisons
 ➡️ Gain/insights
 Data can also be aggregated over a given time period to provide statistics such as
 ➡️ MINS
 ➡️ AVERAGES
 ➡️ MAXS
 ➡️ SUMS
 VLOOKUP stands for Vertical lookup. It is a function that searches for a certain 
 value in a column to return a corresponding piece of information.
 VLOOKUP can match 2 sheets together on a matching column to populate on a single sheet .
It only returns the data it finds to the right it cant look left. 
JOIN is an sql clause that is used to combine rows from 2 or more tables based on a relatable column. It is like VLOOKUP in sql.
 COMMON JOINS
 💠 INNER JOIN: A function that returns records with matching values in 
 both tables.
 💠 LEFT JOIN: A function that will return all the records from the left table 
 and only the matching records from the right table
 💠 RIGHT JOIN: Return all records from the right table and only the 
 the matching records from the left
 💠 OUTER JOIN: Combines right and left join to return all matching records 
 in both tables
 I also learnt about the importance of ALIASES.
 ALIASES are used in sql queries to creat temporary names for a column or
 table. They are implememnted by making use of AS command.
I learnt about ,
 SUBQUERIES: They are queries within queries. They are called an Inner or Nested query. 
 - They can make projects easier and more efficient by allowing complex operations to be performed in a single query, reducing the need for multiple trips to the database.
 - They also make your code readable and maintainable
 Rules to follow to use subquery
 ➡️ Subqueries must be enclosed within parentheses.
 ➡️ A subquery can have one or more columns specified in the select
 clause
 ➡️ Subqueries that return more than one row can only be used with
 multiple values operators, such as the IN operator which alows
 you to specify multiple values in WHERE clause.
 ➡️ Subquery cant be nested in SET command.
 I also learnt about PIVOT TABLES in google sheets. They make it possible to view data in multiple ways in order to identify insights and trends. 
 
 They can help you quickly make sense of larger datasets by comparing metrics, performing calcularions and generating reports. 
 Pivot tables have 4 basic parts
💠 ROWS: Organize and group data you select horizontaly
💠 COLUMNS: Organize and displays values from your data vertically.
💠 FILTERS: Enables you to apply filters based on a specific criteria
💠 VALUES: They are used to calculate and count data.
I also learnt about DATA VALIDATIONS and the TYPES OF DATA VALIDATION
DATA VALIDATION is the process of checking and rechecking the quality of your data so that it is complete, accurate, secure and consistent.
 TYPES OF DATA VALIDATION
 ⭕ DATA TYPES :Check that the data type matches the data type defined for a field 
 ⭕ DATA RANGE: Check that the data falls within an acceptable range of values defined for the field.
 ⭕ DATA CONSTRAINTS: Check that the data meets certain conditions or criteria for a field. This includes the type of data entered as well as other attributes of the field such as number of characters.
 ⭕ DATA CONSISTENCY: Check that the data makes sense in the context of other related data.
 ⭕ DATA STRUCTURE: Check that the data follows or conforms to a set structure
 ⭕ CODE VALIDATION: Check that the application code systematically performs any of the previously mentioned validation during user data input.
 
 I also learnt about TEMPORARY TABLE
 
 They are tables that is created and exists temporarily on a database server. You create a temporary table in SQL using the WITH clause which is a type of temporary tables that you can query from multiple times .They are extremely helpful for complex calculations queries.

Top comments (0)