Parse Big Data with Swift TabularData Framework 💿

If you work as a Data Scientist, you have probably heard about, worked with, or talked about Big Data and the vast number of tools available to Extract, Transform, and Load (ETL) it. You might do this with popular programming languages and frameworks to make sense of raw data.


A DataFrame is a type (object) that lets software developers manipulate raw data through its instance methods. I'm happy that the Swift TabularData Framework follows the same DataFrame convention that is familiar from pandas (which itself builds on NumPy).


If you are a software developer, you might know that Swift is a programming language that is very popular for developing applications for the iPhone, Apple Watch, and Mac. In recent years it has become available on more platforms and operating systems: Windows, Ubuntu, CentOS, and Amazon Linux.


Back to TabularData Framework!


So why do we need a framework vs. SQL (Structured Query Language)?

It comes down to performance and, perhaps, less context-switching for the developer, who can stay in Swift.


This:

dataFrame.description(options: formattingOptions)

is different from this:

SELECT * FROM TABLE_NAME

The first snippet works through unstructured data, while the second queries structured data. That difference also entails different patterns, speed 🐢, and costs 💰
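To make the `dataFrame.description(options:)` call above more concrete, here is a minimal sketch of building a small DataFrame in memory and printing it. The column names and values are hypothetical placeholders, not taken from a real dataset, and the formatting values are arbitrary choices for illustration.

```swift
import TabularData

// Hypothetical sample values, used only to have something to print.
let rideIDs: [String] = ["A1", "B2", "C3"]
let durations: [Int] = [12, 7, 31]

// Build the DataFrame column by column.
var dataFrame = DataFrame()
dataFrame.append(column: Column<String>(name: "ride_id", contents: rideIDs))
dataFrame.append(column: Column<Int>(name: "duration_minutes", contents: durations))

// Control how much of the table is rendered as text.
let formattingOptions = FormattingOptions(
    maximumLineWidth: 120,
    maximumCellWidth: 20,
    maximumRowCount: 2,
    includesColumnTypes: true
)

// Render the table as a String and print it.
let summary = dataFrame.description(options: formattingOptions)
print(summary)
```

Everything here happens in the app's own process: there is no database connection and no query language, only Swift values and method calls.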


If you are interested in this difference, read my article on how I handled it using AWS Redshift and Spark.


The TabularData Framework enables Swift developers to work with Big Data either on the user's device or on the server.


2022 City Of Chicago Divvy Bike Trip Data

In the above snapshot, I mapped the date from a `String` type to a `Date` type (object) to work with it in my Swift application.
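As a rough illustration of that `String` to `Date` mapping, the sketch below parses a single timestamp with Foundation's `DateFormatter`. The `yyyy-MM-dd HH:mm:ss` layout, the Chicago time zone, and the sample value are assumptions made for this example; check them against the actual file before relying on them.

```swift
import Foundation

// Builds a formatter for the assumed timestamp layout "yyyy-MM-dd HH:mm:ss".
func makeTripDateFormatter() -> DateFormatter {
    let formatter = DateFormatter()
    // A fixed POSIX locale keeps parsing independent of the user's settings.
    formatter.locale = Locale(identifier: "en_US_POSIX")
    // Assumption: timestamps are recorded in Chicago local time.
    formatter.timeZone = TimeZone(identifier: "America/Chicago")
    formatter.dateFormat = "yyyy-MM-dd HH:mm:ss"
    return formatter
}

let formatter = makeTripDateFormatter()

// Hypothetical raw value, as it might appear in a String column.
let rawStartedAt = "2022-01-13 11:59:47"

// `date(from:)` returns nil when the text does not match the format.
let startedAt: Date? = formatter.date(from: rawStartedAt)

if let startedAt = startedAt {
    print("Parsed date:", startedAt)
} else {
    print("Could not parse:", rawStartedAt)
}
```

The same function could be applied to every value of a string column, leaving `nil` wherever a row does not match the expected layout.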