Reason: Why was the software history not kept?

Created on 21 Jul 2016 · 3 Comments · Source: reasonml/reason

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like _reason_, deleted their software history.

https://github.com/facebook/reason/commit/1037c7dd36dfe94d7f03a7cb4a7c073f5d3cb840

Since software history is indispensable to developers (e.g., developers need to refer to history several times a day), I would like to ask the Reason developers the following four brief questions:

  1. Why did you decide not to keep the software history?
  2. Did the core developers face any problems when trying to refer to the old history? If so, how did they solve these problems?
  3. Did the newcomers face any problems when trying to refer to the old history? If so, how did they solve these problems?
  4. How has the lack of history affected software evolution? Has it placed any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

All 3 comments

Hey! I'll answer this one, but note that my views are strictly my own and do not necessarily represent the views of my employer blablabla...

  1. In general, a project like Reason (or React) doesn't always intentionally erase history. A large part of it might have been developed internally at Facebook in a different, GitHub-like tool called Phabricator. Once it's battle-tested, we release it to the public, thinking that it'll provide value. I personally prefer this way of doing things to starting from scratch in the open, which fatigues people externally and internally (e.g. communication & maintenance churn). As a rule of thumb, facebook/ contains projects actively used by Facebook. This is good for reputation building as well, imo.

You could argue that we could still sync out the history, but the more important point is that, because the code is used internally, PRs/commits/discussions contain security-sensitive information. Open-sourcing a code dump already needs to go through some security/privacy checks. Having to vet potentially thousands of commits realistically means no one would take on the heroic task of open-sourcing the product at all. When React was open-sourced, Pete Hunt had to disentangle it from internal dependencies. Had he not been there, the few contributors React had would probably have done it, but later.

Also, remember that often something started as one person's random weekend project. Keeping a pristine history might not have been a priority.

  2. I've personally gone through this with the React repo. I mostly just muddled along and checked either the public GitHub repo or the internal one that kept the info. I didn't have to go through this process THAT much, so this is less of a problem than it seems, I think. Additionally, the code should ideally have enough comments to be understandable by itself (aka encode the "why"), rather than relying on some meta-code stuff (e.g. commit messages, history, informal discussions in internal tasks, etc.) that, like you said, gets lost in transfers (but that's just an opinion).
  3. I was such a newcomer. See #2.
  4. They get overridden pretty quickly by external patches. It'd be nice if the language itself had facilities to track evolution. Right now we rely on comments, unit tests and meta-code means. Regressions do still happen, but I think this is a reasonable compromise given #1.

Does this answer your question? If so, feel free to close this issue =).

thank you!

Hi @chenglou,

Thanks once again for answering our research inquiries. We were able to collect 35 responses, and we drafted a research paper with the results. The paper was submitted to and accepted at the 14th International Conference on Open Source Systems (http://oss2018.org/). You can find the paper here. Hope you enjoy reading the paper!

Thanks again,

Gustavo
