Easy-digital-downloads: Move logs to custom tables

Created on 28 Sep 2015  Â·  31Comments  Â·  Source: easydigitaldownloads/easy-digital-downloads

API requests are logged by default with the EDD_Logging class. Each site that uses the API has potentially thousands (millions?) of API requests that are going to get logged.

Currently they get stored in wp_posts/wp_postmeta.

For the most part, they just sit there taking up space and potentially (most likely) slowing down other queries on the site that pull from wp_posts/wp_postmeta.

I propose we move the request logs to a new custom table. This will provide the following advantages:

  • It will remove a TON of data from posts/postmeta
  • It will give us a second (customers was first) test process for the eventual move of payments to a custom table. Requests logs are largely unused so will be far more forgiving of any errors or mistakes made in the migration
  • We'll use one DB row per log entry instead of 5-10 across two tables
  • It will speed up API requests since we're creating one DB entry instead of 5-10
  • It will speed up display of the request logs
  • It will make it easier to introduce future enhancements (such as log pruning)

Pull Request: #6264

priority-high type-feature

Most helpful comment

object_type as varchar(200) seems too long. Should that be varchar(20) instead?

All 31 comments

Just curious, why limit this to only the API requests? Why not the whole EDD_Logging class? I've got a use case right now where I want to use the EDD_Logging class because it makes sense and is pretty easy to extend...except it stores all logs in wp_posts so...I have to be really careful about saving tons and tons of log entries there. A separate table for all logs would be uber much neato.

For this particular case, we want API logs separate from other logs. I can easily see us moving the main logging class too.

Proposed database schema

There are two new tables being introduced:

  • wp_edd_logs
  • wp_edd_logmeta

Schema for wp_edd_logs:

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
id | bigint(20) unsigned | NO | PRI | NULL | auto_increment
parent_id | bigint(20) unsigned | NO | | NULL |
type | varchar(30) | NO |   | NULL |  
title | varchar(200) | NO |   | NULL |  
message | longtext | NO |   | NULL |  
date_created | datetime | NO |   | |  

Schema for wp_edd_logmeta:

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
meta_id | bigint(20) | NO | PRI | NULL | auto_increment
edd_log_id | bigint(20) | NO | MUL | NULL |  
meta_key | varchar(255) | YES | MUL | NULL |  
meta_value | longtext | YES |   | NULL |  

I think the schema is good. To help confirm, here's a list of the log types I'm currently aware of that will be stored:

  • API requests
  • License activation / deactivation (or are these separate in EDD SL @cklosowski?)
  • Payment gateway errors
  • File downloads

We do have an opportunity here to merge logs and some notes together. We've talked about a dedicated notes table, but what if we instead made notes part of the logs table? It would make sense for at least some of the data that's currently stored as notes. For example, payments store "comments" whenever the status of a payment changes. Subscriptions from EDD Recurring do this as well. It makes logical sense for those to be logs as they're really a historical event much more so than a note.

I like it. We could link the payment by adding in an object_id column to the wp_edd_logs table so a payment note would look like this:

id | parent_id | object_id | type | title | message | date_created
-- | -- | -- | -- | -- | -- | --
1 | | 4 | payment | Status change | Status changed from Pending to Complete | 2017-12-25 01:54:00

One thing to note is that we currently a user ID can be attached to a payment note (because it's part of WP_Comment) - would we need to port that across if we decide to merge notes and logs?

We will need a creator_id, author_id, user_id, or similar to record who made a note.

What's a scenario where we need parent|child relationships for logs?

Ah, I forgot to remove that, not quite so sure why I added parent_id 😄

Proposed database schema (version 2)

There are two new tables being introduced:

  • wp_edd_logs
  • wp_edd_logmeta

Schema for wp_edd_logs:

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
id | bigint(20) unsigned | NO | PRI | NULL | auto_increment
object_id | bigint(20) unsigned | YES | MUL | NULL |
type | varchar(30) | NO |   | |  
title | varchar(200) | NO |   | |  
message | longtext | NO |   | |  
date_created | datetime | NO |   | |  
user_id | bigint(20) unsigned | NO | | 0 |

Schema for wp_edd_logmeta:

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
meta_id | bigint(20) | NO | PRI | NULL | auto_increment
edd_log_id | bigint(20) | NO | MUL | |  
meta_key | varchar(255) | YES | MUL | NULL |  
meta_value | longtext | YES |   | NULL |  

Note: the default user_id is set to 0. If this is the case, we assume it's EDD Bot.

@sunnyratilal In your example above I think payment should be payment_note.

Probably not a popular opinion, but I'm not convinced we should introduce a global log/note table. I agree with what Christoff said here, that payment notes should be in their own table.

Having a global logs table means we automatically have to query/filter based on the object type before we can use any of the data. That introduces overhead and the possibility of people making mistakes with the data. If we have to filter out unrelated data every time we fetch from the table, it's a good indicator the table structure isn't right.

Is there any use case where all the log data will be needed without filtering by type? If not, I think that's a good case for separate tables.

Also API logs, license logs, and download logs all have the potential to generate tons of entries. Along with payments, each of these logs types is slightly different and may require a different data structure down the road, if not from day one. Compromising on a more abstracted, global data structure will result in us relying on meta data for the case where there's not a column or where adding one doesn't make sense for all logs types, which will undo much of the performance benefits we get from going with custom tables.

Another thought. Conceptually it could be argued that logs and notes are different things, primarily that one is an automated activity log and the other is supplied by a human.

Also a note that with the new payments tables, it may not be necessary to
maintain sales logs. Since we will have a payment line items table to do
basically the same thing.

On Dec 27, 2017 8:54 AM, "John Parris" notifications@github.com wrote:

Probably not a popular opinion, but I'm not convinced we should introduce
a global log/note table. I agree with what Christoff said here
https://github.com/easydigitaldownloads/easy-digital-downloads/issues/5277#issuecomment-352960704,
that payment notes should be in their own table.

Having a global logs table means we automatically have to query/filter
based on the object type before we can use any of the data. That introduces
overhead and the possibility of people making mistakes with the data. If we
have to filter out unrelated data every time we fetch from the table, it's
a good indicator the table structure isn't right.

Is there any use case where all the log data will be needed without
filtering by type? If not, I think that's a good case for separate tables.

Also API logs, license logs, and download logs all have the potential to
generate tons of entries. Along with payments, each of these logs types is
slightly different and may require a different data structure down the
road, if not from day one. Compromising on a more abstracted, global data
structure will result in us relying on meta data for the case where there's
not a column or where adding one doesn't make sense for all logs types,
which will undo much of the performance benefits we get from going with
custom tables.

Another thought. Conceptually it could be argued that logs and notes are
different things, primarily that one is an automated activity log and the
other is supplied by a human.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/easydigitaldownloads/easy-digital-downloads/issues/3841#issuecomment-354133348,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABU2LJAV3Sc8g6hmE8qiVD6CXXRrZup-ks5tEmgagaJpZM4GFG4N
.

Imo: Payment notes should not be in the payments table and not in a global
notes table. There are things that people and even EDD uses on payments to
store stuff. Not just Edd bot for payment transaction id logging but also
things like fraud monitor log stuff into logs. Also I know a couple other
extensions do as well.

On Dec 27, 2017 10:56 AM, "Chris Klosowski" notifications@github.com
wrote:

Also a note that with the new payments tables, it may not be necessary to
maintain sales logs. Since we will have a payment line items table to do
basically the same thing.

On Dec 27, 2017 8:54 AM, "John Parris" notifications@github.com wrote:

Probably not a popular opinion, but I'm not convinced we should introduce
a global log/note table. I agree with what Christoff said here
digital-downloads/issues/5277#issuecomment-352960704>,
that payment notes should be in their own table.

Having a global logs table means we automatically have to query/filter
based on the object type before we can use any of the data. That
introduces
overhead and the possibility of people making mistakes with the data. If
we
have to filter out unrelated data every time we fetch from the table,
it's
a good indicator the table structure isn't right.

Is there any use case where all the log data will be needed without
filtering by type? If not, I think that's a good case for separate
tables.

Also API logs, license logs, and download logs all have the potential to
generate tons of entries. Along with payments, each of these logs types
is
slightly different and may require a different data structure down the
road, if not from day one. Compromising on a more abstracted, global data
structure will result in us relying on meta data for the case where
there's
not a column or where adding one doesn't make sense for all logs types,
which will undo much of the performance benefits we get from going with
custom tables.

Another thought. Conceptually it could be argued that logs and notes are
different things, primarily that one is an automated activity log and the
other is supplied by a human.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
digital-downloads/issues/3841#issuecomment-354133348>,
or mute the thread
ABU2LJAV3Sc8g6hmE8qiVD6CXXRrZup-ks5tEmgagaJpZM4GFG4N>
.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/easydigitaldownloads/easy-digital-downloads/issues/3841#issuecomment-354133644,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABQ2XSBYm5DhkgRnaQuOkKJKSI32h9ZOks5tEmiigaJpZM4GFG4N
.

Notes schema proposed by @JJJ here is good.

If we separate logs and notes, where do status changes fall? Are they a log or a note? They be be triggered automatically or by a store owner.

Status change would be a log. A note is user supplied text, not a log of activity.

I don't want to get too caught up on note vs log. I'm more interested in getting the schema right.

WIP PR: #6264

Just as suggestion

For GamiPress logs I build a library that lets you register custom tables like CPT:
https://github.com/rubengc/CT

It makes posible to register a custom table and get the WordPress CRUD features like the add meta boxes functionallity for edit screen, handler for meta data and hooks similar to WordPress to register list fields

With this library I was able to setup an ecommerce add-on for GamiPress and everything is registered as custom tables (payments, payments items, payments notes, etc) so needs by EDD are covered (i am looking to add support for rest API too)

I was based this on @jjj DB class suggested to be added to WordPress core

Thanks for the suggestion @rubengc, I think we'll be better suited to keeping our existing abstraction layer with EDD_DB for the time being as we've already used it for customers. It's also been used for discounts and is being used in Software Licensing 3.6.

@easydigitaldownloads/core-team I'd just like to get feedback on the way I've implemented things here so far. I've spoken with @mindctrl and this is what made most sense to both of us.

  1. A global wp_edd_logs and wp_edd_logmeta has been introduced.
  2. Separate log tables have been introduced for API requests and file downloads, wp_edd_api_request_logs and wp_edd_file_download_logs, respectively. There will be one more table for gateway errors.
  3. New classes have been introduced for each log type, EDD_API_Request_Log, EDD_File_Download_Log and EDD_Log. EDD_Gateway_Error_Log will be introduced.

wp_edd_logs schema

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
id | bigint(20) unsigned | NO | PRI | NULL | auto_increment
object_id | bigint(20) unsigned | YES | MUL | NULL |
object_type | varchar(200) | NO |   | |  
type | varchar(30) | NO |   | |  
title | varchar(200) | NO |   | |  
message | longtext | NO |   | |  
date_created | datetime | NO |   | |

The reason for object_type is to specify which table the object_id is stored in, i.e. discount, payment etc. (More explained below).

wp_edd_logmeta schema

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
meta_id | bigint(20) | NO | PRI | NULL | auto_increment
edd_log_id | bigint(20) | NO | MUL | |  
meta_key | varchar(255) | YES | MUL | NULL |  
meta_value | longtext | YES |   | NULL |  

wp_edd_api_request_logs schema

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
id | bigint(20) unsigned | NO | PRI | NULL | auto_increment
user_id | bigint(20) unsigned | YES | MUL | NULL |
api_key | varchar(32) | NO |   | |  
token | varchar(32) | NO |   | |  
version | varchar(30) | NO |   | |  
request | longtext | NO |   | |  
error | longtext | NO |   | |  
ip | varchar(100) | NO |   | |  
time | varchar(60) | NO |   | |  
date_created | datetime | NO |   | |

wp_edd_file_download_logs schema

Field | Type | Null | Key | Default | Extra
-- | -- | -- | -- | -- | --
id | bigint(20) unsigned | NO | PRI | NULL | auto_increment
download_id | bigint(20) unsigned | NO | MUL | NULL |
file_id | bigint(20) unsigned | NO | | NULL |
payment_id | bigint(20) unsigned | NO | | NULL |
price_id | bigint(20) unsigned | NO | | NULL |
user_id | bigint(20) unsigned | NO | | NULL |
ip | varchar(100) | NO | | NULL |
date_created | datetime | NO |   | |

The reason for the approach behind multiple tables for each type is because log type has different data associated with it. For example, an API request log stores the API key and token, having these as columns in a single, global logs table doesn't make sense. Neither does the fact that we would store them in the wp_edd_logmeta. If we store additional log data in meta, it's still going to mean around 5-10 rows in the DB across two tables rather than a single log entry.

Is this the direction we'd like to follow with logs?

I like that approach. I didn't think I did at first but after thinking about it more, the unique data set for each makes the unique tables appropriate.

I'd like to recommend anything like...

object_id | bigint(20) unsigned | YES | NULL
-- | -- | -- | --

...become...

object_id | bigint(20) unsigned | NO | 0
-- | -- | -- | --


Reasons

  • It's what WordPress does all over, so it's known and predictable
  • Having a 0 default value helps solidify PHP logic, type casting, etc...

object_type as varchar(200) seems too long. Should that be varchar(20) instead?

I think we should refrain from having edd_ prefixes in database column names.

I.E. change edd_log_id to log_id

If we separate logs and notes, where do status changes fall? Are they a log or a note? They be be triggered automatically or by a store owner.

In theory, CRUD actions around notes should be logged.

  • Logs are related to actions performed by users/customers/system/etc
  • Notes are, basically, comments

If I edit or delete a note from a thing, I'd want that logged.

I think we should refrain from having edd_ prefixes in database column names.

I would agree except in regards to the meta tables as the WP Meta API requires the object type to be used as a prefix to the ID column.

@JJJ Same as #5277, gone ahead and removed the create_table() methods from the global logs table. I've kept the schema for API request logs and file download logs tables as #6277 hasn't introduced those classes yet.

@sunnyratilal THANKS! I’ll review, comment, and/or merge today if all’s well.

This also needs to be tested with EDD Message as it makes use of the EDD_Logging class.

@easydigitaldownloads/core-team This is now ready for full code review.

As per a discussion in Slack, we won't expose public API to allow editing of logs.

Closing out as this has been completed. Any bugs found during alpha/beta testing should be opened as new issues.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

boluda picture boluda  Â·  4Comments

colomet picture colomet  Â·  4Comments

mindctrl picture mindctrl  Â·  4Comments

zackkatz picture zackkatz  Â·  5Comments

davidsherlock picture davidsherlock  Â·  4Comments