Working in Excel
Mini skill case study.
MINI SKILL STUDYEXCELDATA VISUALISATIONDATA CLEANING
Fynn Burgess
11/21/20233 min read
Working in Excel
Please be aware that this dataset was created using AI and may not faithfully replicate real-world situations. While it can be utilized for learning purposes, the outcomes derived from it might not align logically or consistently with real-world scenarios. Exercise caution when drawing conclusions or making decisions based on this dataset, as its AI-generated nature could result in data patterns or conclusions that might not align with practical or logical expectations.
About the dataset:
The dataset contains detailed information on various vehicles, encompassing different manufacturers, models, manufacturing years, vehicle types, prices in Australian dollars (AUD), available stock counts, and yearly sales figures for the year 2022.
Each entry in the dataset includes the following attributes:
· Vehicle ID: A unique identifier for each vehicle.
· Manufacturer: The brand or company producing the vehicle.
· Model: The specific name or model of the vehicle.
· Manufacturer, Model: A combined field indicating both the manufacturer and model.
· Year: The manufacturing year of the vehicle.
· Type: Categorization of the vehicle type, such as Sedan, Hatchback, SUV, Pickup, Coupe, Minivan, Wagon, Van, Convertible, or Minivan.
· Price (AUD): The price of the vehicle in Australian dollars.
· Stock Available: The count of available units in stock.
· Yearly Sales (2022): The number of units sold in the year 2022.
This dataset encompasses a range of vehicles, offering insights into the used dealerships’ diversity, pricing trends, and sales performance for different manufacturers, models, and vehicle types. It's structured to facilitate analysis and exploration of the automotive market landscape for various purposes such as stock research, pricing strategy, or sales performance evaluation. However, it's essential to note that as an AI-generated dataset.
The three main sheets in the workbook:
Dashboard:
The Dashboard sheet serves as a visual representation of key metrics, trends, and data summaries from various sources within the Excel workbook. This includes a stock vs sales bar chart, sales vs Manufacturer, Model bar chart, AVG sales vs price scatter plot, AVG price over Year manufactured line chart, sales vs Year manufactured line chart & slicers/filters.
Working_Database:
The Working_Database sheet functions as the primary data manipulation area. It contains raw or structured data imported from external sources or manually entered. This sheet serves as the foundational dataset for analysis, calculations, and creating reports within the Excel workbook.
Pivot_Table:
The Pivot Table sheet is designed to create dynamic analyses of the data stored in the Working_Database sheet. It allows users to perform quick data analysis. Using pivot tables to visualize trends by filtering and manipulating the data. The generated visualisations help make the dashboard sheet.
Original_Dataset:
The Original Dataset sheet stores the initial, unaltered data imported directly from the AI data source. This sheet serves as an archival or reference point, preserving the raw data before any modifications, transformations, or analysis take place. It ensures the availability of the original data for comparison, auditing, or as a source for reprocessing in case of any changes made in subsequent analyses. It’s also important to mention that a copy of the original dataset is also stored in other Excel workbooks for redundancy purposes.
Other sheets:
serve as compartments for distinct stages in data cleaning. These sheets segregate tasks like resolving nulls, concatenating, formatting, and more. They streamline the process, allowing focused cleaning without cluttering the working sheet, ensuring an organized and efficient data exploration journey.
Things I learned:
1. Creating dashboards in Excel (I prefer Tableau & will be making a mini case with this dataset in Tableau in the coming days).
2. Using pivot tables & their limitations. More evidently for me in this study was not being able to use scatterplots??? If I copy the exact same data outside a pivot table, it worked?? More investigation is needed later.
3. Making use of CSV files to add or remove data. This was especially important when working with data from AI & the AI’s text output limits.