If you're in a Data Scientist role at a company, you've likely heard of, worked with, or discussed Big Data and the plethora of tools available for Extraction, Loading, and Transformation. Often, this involves using popular coding languages and frameworks to interpret raw data.
A DataFrame is a type (or object, much like a number or a currency value) that lets software developers manipulate raw data through its instance methods. I'm pleased that Swift's TabularData framework has adopted conventions similar to those of pandas and NumPy.
If you're a software developer or programmer, you probably know that Swift is a renowned programming language primarily used for developing software applications for the iPhone, Apple Watch, and MacBook. Over the past five years, its availability has expanded to other platforms and operating systems, including Windows, Ubuntu, CentOS, and Amazon Linux.
Back to the TabularData framework!
Why do we need a framework instead of SQL (Structured Query Language)? It boils down to performance and potentially reducing context-switching for developers.
This:
dataFrame.description(options: formattingOptions)
is different from this:
SELECT * FROM TABLE_NAME
The former queries unstructured data, which is distinct from the latter's structured data. This distinction brings about differences in patterns, speed 🐢, and costs 💰.
If you're curious about the differences mentioned, check out my article on how I achieved this using AWS Redshift and Spark.
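To make the `description(options:)` call shown earlier a little more concrete, here is a minimal sketch of how it might be used. The column names, sample values, and the specific formatting limits are my own assumptions for illustration, and the snippet assumes an Apple platform where TabularData is available (macOS 12 or iOS 15 and later).

```swift
import TabularData

// Hypothetical sample data; in practice this would come from a CSV or JSON source.
let names: [String] = ["alpha", "beta", "gamma"]
let scores: [Int] = [10, 20, 30]

// Build the DataFrame one typed column at a time.
var dataFrame = DataFrame()
dataFrame.append(column: Column<String>(name: "name", contents: names))
dataFrame.append(column: Column<Int>(name: "score", contents: scores))

// Illustrative limits on how much of the table gets rendered as text.
let formattingOptions = FormattingOptions(
    maximumLineWidth: 120,
    maximumCellWidth: 30,
    maximumRowCount: 10,
    includesColumnTypes: true
)

// description(options:) returns a printable, table-like String.
let summary = dataFrame.description(options: formattingOptions)
print(summary)
```

Unlike the SQL statement, nothing here leaves the process: the table lives in memory and the call only renders it as text.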
The TabularData framework enables Swift developers to work with Big Data on users' devices or on servers.
In the above snapshot, I mapped the date from a `String` type to a `Date` type (object) to work with it in my Swift application.
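A mapping of this kind could be sketched roughly as follows. This is not necessarily the exact code from my application: the `date` and `parsedDate` column names, the sample values, and the `yyyy-MM-dd` format are assumptions made for illustration. Values that fail to parse are kept as `nil` rather than dropped.

```swift
import Foundation
import TabularData

// Hypothetical raw values; the last one is deliberately invalid.
let rawDates: [String] = ["2023-01-15", "2023-02-01", "not a date"]

var dataFrame = DataFrame()
dataFrame.append(column: Column<String>(name: "date", contents: rawDates))

// Fixed-format parser; the format string is an assumption about the data.
let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd"
formatter.locale = Locale(identifier: "en_US_POSIX")
formatter.timeZone = TimeZone(secondsFromGMT: 0)

// Each element of a Column<String> is a String?, so missing values stay nil.
var parsedDates: [Date?] = []
for raw in dataFrame["date", String.self] {
    parsedDates.append(raw.flatMap { formatter.date(from: $0) })
}

// Store the result as a typed Date column next to the original strings.
dataFrame.append(column: Column<Date>(name: "parsedDate", contents: parsedDates))
```

Keeping the original `String` column alongside the parsed one makes it easy to inspect rows that could not be converted.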