Gitmoji: 🧐 Data exploring/inspecting

Created on 14 Nov 2020  ·  12 Comments  ·  Source: carloscuesta/gitmoji

Hello @carloscuesta :sunglasses:!

  • Emoji: 🧐
  • Code: :monocle_face:
  • Description: Data science is a fast-growing field in the software industry. Most practices in data science map onto existing gitmojis (new architecture -> new features, etc.), but exploring and inspecting data, including exploratory data analysis, has no parallel in classic software engineering. Since writing notebooks to inspect input data, features, model results and the like is very common in data science, I think it deserves its own gitmoji :)
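
For illustration only, the proposal above could be written down as a single record. The field names (`emoji`, `entity`, `code`, `description`, `name`) and the helper `make_entry` are assumptions made for this sketch; the thread does not show the repository's actual schema, and the description text is the wording suggested later in the discussion.

```python
import json


def make_entry(emoji, code, description):
    """Build a hypothetical gitmoji-style record (field names are assumed)."""
    return {
        "emoji": emoji,
        # HTML entity derived from the emoji's code point.
        "entity": f"&#x{ord(emoji):x};",
        "code": code,
        "description": description,
        # Short name derived from the shortcode, e.g. ":monocle_face:" -> "monocle-face".
        "name": code.strip(":").replace("_", "-"),
    }


entry = make_entry("\U0001F9D0", ":monocle_face:", "Data exploration/inspection.")
print(json.dumps(entry, ensure_ascii=False, indent=2))
```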

About testing, I read the contribution guide and haven't seen anything about that... So how should I go about testing?

emoji

All 12 comments

About testing, I read the contribution guide and haven't seen anything about that... So how should I go about testing?

Sorry, what do you mean?


Also, I'm not too knowledgeable in data science, and I didn't understand why this data exploration/inspection would need a commit. And if you mean adding another graph and/or another paragraph to a notebook, then couldn't we use :memo: or :sparkles:?

Sorry, what do you mean?

I saw that my PR did not pass all tests, so I was wondering how to make them pass, and noted that I didn't find anything about it in the documentation.


As for why we need a gitmoji for data exploring, let's look at how data exploration is done.
In data science, we explore data through code, usually in Jupyter notebooks - see example.

These kinds of explorations are made in parts, much like any other piece of code. So you might want to commit things like "Added distribution plots for feature X" and then maybe "Exploration of textual data points" and stuff like that.
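
As a rough illustration of what one such exploration step might contain, here is a minimal sketch using only the Python standard library. The data in `feature_x` and the helpers `summarize` and `text_histogram` are made up for this example; a real notebook would more likely use a plotting library.

```python
import statistics
from collections import Counter

# Hypothetical toy data standing in for "feature X"; not taken from the thread.
feature_x = [1.2, 0.8, 1.9, 2.4, 1.1, 0.7, 3.0, 1.5, 2.2, 1.0]


def summarize(values):
    """Return basic descriptive statistics for a numeric feature."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def text_histogram(values, bin_width=1.0):
    """Bucket values into fixed-width bins and render one text bar per bin."""
    bins = Counter(int(v // bin_width) for v in values)
    lines = []
    for b in sorted(bins):
        low, high = b * bin_width, (b + 1) * bin_width
        lines.append(f"[{low:.1f}, {high:.1f}) " + "#" * bins[b])
    return lines


summary = summarize(feature_x)
histogram = text_histogram(feature_x)
print(summary)
print("\n".join(histogram))
```

A cell like this would typically be committed together with a short note on what the distribution shows, which is the kind of commit the proposed gitmoji is meant to label.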

As I see it, there is no existing gitmoji equivalent, since these are not new "features" but rather the act of exploring existing ones - this also extends to exploring the results of a model's predictions and similar things.

To sum up, exploring/inspecting does not introduce new features; it is a code-oriented way to explore existing ones.

Hey @eliorc 👋 Thanks for opening a PR. Like @vhoyer, I don't know a lot about data science workflows and can't really understand what you mean by 'data exploring'. Is it just about new code that does something? Does it create new files? What exactly happens when you do data exploring, in terms of code or file manipulation?

So, about the tests: what is breaking them is probably the snapshots. I will assume you are not familiar with that kind of test, and I like the idea of adding this info to the repo, so I will do that later (mostly because this is not the first time this has raised doubts, haha). To resolve your test errors, open the repo and run `npm install && npm test -- -u`, then commit the changes, probably using :camera_flash:. Of course, to do so you will need Node installed (which installs npm alongside it).
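
For readers who have not met snapshot tests, the idea can be sketched in a few lines. This is a toy model written in Python, not the project's test runner: `check_snapshot` and its `update` parameter are hypothetical names, with `update=True` playing the role that the `-u` flag is assumed to play in the command above.

```python
import json
import tempfile
from pathlib import Path


def check_snapshot(name, value, snapshot_dir, update=False):
    """Compare value to a stored snapshot; write it when missing or update=True."""
    path = Path(snapshot_dir) / f"{name}.json"
    rendered = json.dumps(value, indent=2, sort_keys=True, ensure_ascii=False)
    if update or not path.exists():
        # First run, or an explicit update: record the current output as the truth.
        path.write_text(rendered, encoding="utf-8")
        return True
    # Otherwise the test passes only if the output is unchanged.
    return path.read_text(encoding="utf-8") == rendered


with tempfile.TemporaryDirectory() as tmp:
    emojis = [{"code": ":memo:"}]
    first_run = check_snapshot("emojis", emojis, tmp)      # no snapshot yet, so it is written
    emojis.append({"code": ":monocle_face:"})
    after_change = check_snapshot("emojis", emojis, tmp)   # output differs from the snapshot
    after_update = check_snapshot("emojis", emojis, tmp, update=True)
    rerun = check_snapshot("emojis", emojis, tmp)          # matches the refreshed snapshot

print(first_run, after_change, after_update, rerun)
```

Under this model, adding a new emoji changes the recorded output, so the comparison fails until the snapshot is regenerated and committed.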


I see, I get it now, but why wouldn't you consider adding new blocks to a notebook a new feature? It's adding a table or a graph to the document that was not there before, right? Along the same lines, what would constitute a feature in those cases, in your opinion?

@vhoyer

We consider features to be new "functionalities" created in the project. For example, a new model architecture, a new preprocessing pipeline, or a new evaluation scheme. Basically, these are things that can be reused in the future once they have been developed.

Explorations/inspections are one-off actions, and definitely not reusable, as the insights drawn from them are only relevant to the data they were made upon. You might reuse the features you developed to create those insights - but using those features to generate the insights is a different thing.

If you want to find the equivalent in classic engineering, I would say that data explorations/inspections are like unit test runs. A run is an action that is executed at a certain point in time and has results - the test runs are not features, just like the explorations are not (and again, maybe you develop features to later be used in the tests, but that's a different thing).

@johannchopin what happens is that you write some code in an interactive notebook, execute it, and most likely create plots and even write some markdown to convey your insights. You will usually do it in logical blocks. For example, let's say you have developed 100 different models - you will want to inspect and explore their predictions, looking for their weak points, to get a clearer understanding of what you should do in your next research iteration. I also explained this two comments ago, with examples.

Ok, I agree with its inclusion in gitmoji and will take a look at your PR later. What does the rest of the gang think about it?

Ok, I also agree with the integration of this emoji. But what would be the description exactly? Is there a better emoji for that because IMO 🧐 isn't explicit enough.

Is there a better emoji for that because IMO :monocle_face: isn't explicit enough.

I have more candidates, but the monocle is my favorite - here are all of them, sorted by my personal preference:

  1. :monocle_face: :monocle_face:
  2. :detective: :detective:
  3. :microscope: :microscope:

But what would be the description exactly?

About that, I think "Data exploration/inspection" covers the use cases of exploring (EDA and the like) and inspection (inspecting model results, comparisons, etc.).

So let's vote on the emoji :sweat_smile:

|vote|emoji|
|-|-|
| :rocket: | :monocle_face: :monocle_face:|
| :heart: | :detective: :detective:|
| :tada: | :microscope: :microscope:|

Following up on @vhoyer's comment,

It seems that we already have a winner 🎉

Just saw you already opened the PR @eliorc. We should update it with the results of the poll!

Thanks 🙌🏻
