Fec-cms: Improve development workflow for CMS

Created on 7 Oct 2016 · 12 Comments · Source: fecgov/fec-cms

So that developers can easily test new pages that require content from the database, improve the development workflow.

The main problem to address is that developing new CMS features that depend on generated content is difficult: getting set up takes effort, and it is hard to keep track of where everything lives.

Needs refinement Back-end

All 12 comments

@ccostino and I figured out that a straightforward solution would be to have a database dump live in the repo (or a subset if it gets large enough). Any time there is database work, the dump would then be updated to reflect those changes.

Additionally, the dump would also be periodically updated as content gets added in.

We'll also write up a guide to document a proposed workflow for folks to follow and put it in the README or the wiki of the repo.

To add a bit of clarification to this, we do have some instructions for retrieving existing data in our README. However, this still requires getting oneself set up with the app, running migrations, then getting the data and loading it; that is a lot of hoops to jump through.

I worked through the cloud.gov database export steps and was able to pull a full backup pretty easily. The problem with this is that the backup needs to be scrubbed of data in a few tables, namely django_session, anything starting with auth_*, django_content_type, and probably django_admin_log. Running the TRUNCATE command on these tables should be enough to do the trick. With that done, we should have a usable backup to build off of that contains all current migrations applied and any content produced.
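As a rough illustration of the scrubbing step described above, here is a minimal sketch that picks the tables to clear from a list of table names and builds a single TRUNCATE statement. The function names and the example table list are hypothetical; only the table patterns come from the comment above, and the statement is only built, not executed.

```python
from fnmatch import fnmatchcase

# Patterns taken from the comment above. django_admin_log was only
# "probably" needed, so treat this tuple as a starting point.
SENSITIVE_PATTERNS = (
    "django_session",
    "auth_*",
    "django_content_type",
    "django_admin_log",
)


def tables_to_scrub(all_tables, patterns=SENSITIVE_PATTERNS):
    """Return the sorted table names that match any of the patterns."""
    return sorted(
        name
        for name in all_tables
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )


def truncate_statement(tables):
    """Build one TRUNCATE statement covering every table in `tables`."""
    if not tables:
        raise ValueError("no tables to truncate")
    # Double-quote each identifier, doubling any embedded quote characters.
    quoted = ", ".join('"{}"'.format(t.replace('"', '""')) for t in tables)
    return "TRUNCATE TABLE {};".format(quoted)


if __name__ == "__main__":
    # Hypothetical table list; a real one would come from the database.
    existing = ["auth_user", "auth_group", "django_session", "example_page"]
    print(truncate_statement(tables_to_scrub(existing)))
```

One caveat worth keeping in mind: PostgreSQL refuses to TRUNCATE a table that other tables reference through foreign keys unless those tables are truncated in the same command or CASCADE is added, and CASCADE would also empty the referencing tables.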

One other thing to consider going down this route though is that the backup is for PostgreSQL. For local development we're using SQLite. Personally I don't see any issue with switching, but if we're going to do so then I propose we build a couple of tools for ourselves and others here much like we have in the API. This means the following:

  • Adding the backup (scrubbed of sensitive data) to our repo.
  • Creating a couple of invoke tasks to account for managing the backup (similar to the tasks we have in the API).
  • Switching the local development environment to use PostgreSQL instead of SQLite.
  • Updating documentation accordingly.
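For the PostgreSQL switch in the list above, the local settings change might look something like the following sketch. It assumes the connection is supplied through a DATABASE_URL environment variable; the default database name fec_cms_local and the database_config helper are hypothetical and not taken from the repo.

```python
import os
from urllib.parse import unquote, urlparse


def database_config(url):
    """Translate a postgresql:// URL into a Django DATABASES entry."""
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError("expected a PostgreSQL URL, got: {}".format(url))
    return {
        # Django versions before 1.9 name this backend
        # "django.db.backends.postgresql_psycopg2" instead.
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


# Hypothetical default for local development; override with DATABASE_URL.
DATABASES = {
    "default": database_config(
        os.environ.get("DATABASE_URL", "postgresql://localhost/fec_cms_local")
    )
}
```

Reading the connection from the environment would let local development and the deployed app share one settings path, though whether that fits the existing settings layout is an open question.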

This adds a bit of work to this story, but shouldn't take much more time (and we already have work to draw from in the API). If we do these things, we should be able to cut the number of commands down to just a couple that any given developer would have to run on a regular basis. The need for a full refresh of the backup shouldn't be as great or as often, either.
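To make "just a couple of commands" concrete, here is a sketch of helpers that build the pg_dump and pg_restore invocations a pair of invoke tasks could run. The helper names, file names, and database URL are hypothetical; the commands are only constructed here, not executed, and the actual tasks in the API repo may look different.

```python
import shlex


def dump_command(db_url, out_path):
    """Arguments for writing a custom-format archive of the database."""
    return [
        "pg_dump",
        "--format=custom",  # archive format that pg_restore can read
        "--no-owner",       # skip ownership so any local role can restore
        "--file={}".format(out_path),
        "--dbname={}".format(db_url),
    ]


def restore_command(db_url, in_path):
    """Arguments for loading an archive into a local database."""
    return [
        "pg_restore",
        "--clean",      # drop existing objects before recreating them
        "--if-exists",  # avoid errors when those objects are missing
        "--no-owner",
        "--dbname={}".format(db_url),
        in_path,
    ]


def as_shell(args):
    """Join arguments into one shell-safe string for a task runner."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    local_db = "postgresql://localhost/fec_cms_local"  # hypothetical
    print(as_shell(dump_command(local_db, "backup.dump")))
    print(as_shell(restore_command(local_db, "backup.dump")))
```

Keeping the command construction separate from the task wrapper makes the arguments easy to check without touching a real database.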

@xtine, what do you think?

  • Invoke tasks sound like a great way to go
  • I like the idea of switching to postgres; that may change the way we load data in the future, though probably not too much. And I am guessing that it will be hairier than we estimate.
  • I am a bit wary of accidentally publishing something we shouldn't if we dump the db into the repo. Maybe we could use a private S3 bucket for the backup or pull it from the invoke task. That way we don't put unpublished content in the repo, or miss some other edge case of something that shouldn't be public.

This sounds like a great plan. I'm interested in @LindsayYoung 's suggestion about using a private S3 bucket (if it doesn't add complexity to the setup).

I agree with @ccostino and @LindsayYoung's additional points. I really like the idea of changing local to postgres to get the development environments synced up. If the commands are cut down to just a couple, then it could truly be a one-liner update by putting them into a script.

Oh good call, @LindsayYoung! I'm going to tag @jcscottiii here too because he's working on additional support for this very thing in cloud.gov. :-)

Small update on what needs to be cleared out of the database: as far as the auth_* tables are concerned, it looks like only auth_user and auth_user_groups contain info that would need to be removed. Thinking about it a bit more, perhaps django_content_type doesn't need to be cleared out either. That only becomes an issue when you're using the manage.py dumpdata and loaddata commands. Since we're reusing the database itself, though, these should be okay as is.

And of course, anything that references a user via user_id and/or the auth_user table is going to pose an issue...
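One way to see which tables would be affected is to list every foreign key that points at auth_user. The sketch below does this against an in-memory SQLite database as a stand-in so that it runs anywhere; on the PostgreSQL backup the same question would be answered from the system catalogs instead. The example_page and example_image tables are hypothetical.

```python
import sqlite3


def references_to(conn, target):
    """Return sorted (table, column) pairs whose foreign key targets `target`."""
    found = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    for table in tables:
        pragma = 'PRAGMA foreign_key_list("{}")'.format(table.replace('"', '""'))
        # Each row is (id, seq, referenced_table, from_column, to_column, ...).
        for row in conn.execute(pragma):
            if row[2] == target:
                found.append((table, row[3]))
    return sorted(found)


def demo_connection():
    """Build a tiny stand-in schema with two references to auth_user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE django_admin_log (
            id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES auth_user(id)
        );
        CREATE TABLE example_page (
            id INTEGER PRIMARY KEY,
            owner_id INTEGER REFERENCES auth_user(id),
            title TEXT
        );
        CREATE TABLE example_image (id INTEGER PRIMARY KEY, file TEXT);
        """
    )
    return conn


if __name__ == "__main__":
    for table, column in references_to(demo_connection(), "auth_user"):
        print("{}.{}".format(table, column))
```

Each pair it reports is a column that would need to be nulled, reassigned, or cleared before the user rows can be removed.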

We've been talking about splitting this work up into discrete tasks; here is how I would break it apart:

  • Documentation updates (already underway: #578)
  • Django and Wagtail updates (already ~underway~ done: #602)
  • Switching SQLite to PostgreSQL
  • Adding invoke tasks for pulling and managing the backups (work with @jcscottiii on integrating a cloud.gov utility he is building for this very thing)

👍 @ccostino: let me know if you need any help on these items!

Closing this issue now that we have smaller tasks. Thanks!
