Hello - I was wondering if someone could put together a way to export a .tde (Tableau Data Extract) file. I spend a lot of time reading and writing .csv files and then extracting the data manually in Tableau, but it would be cool to skip the .csv step and read/write directly to the .tde format.
http://onlinehelp.tableau.com/current/api/sdk/en-us/help.htm#SDK/tableau_sdk_using_python.htm%3FTocPath%3D_____4
Is TDE a proprietary format?
Yes, it is. They do have a module/API for Python 2.7 though called dataextract. Here is a sample of actual code I need to use. I create a file in pandas, export it to .csv, then I read it again with this code, change the column dtypes, and export it as .tde. I know some ETL software like Alteryx offers the ability to directly output .tde files.
import csv
import datetime
import os

import dataextract as tde

# Raw string so the backslashes in the Windows path are not treated as escapes.
loc = r'C:\Users\Documents'
tdefilename = 'y.tde'
csvname = 'x.csv'

tdefile = tde.Extract(os.path.join(loc, tdefilename))
# Python 2.7: the csv module expects the file to be opened in binary mode.
reader = csv.reader(open(os.path.join(loc, csvname), 'rb'), delimiter=',', quotechar='"')

if tdefile.hasTable('Extract'):
    # Reuse the existing table and its column definitions.
    table = tdefile.openTable('Extract')
    tabledef = table.getTableDefinition()
else:
    # Define the columns, then create the table.
    tabledef = tde.TableDefinition()
    tabledef.addColumn('Date', tde.Type.DATE)
    tabledef.addColumn('ID', tde.Type.CHAR_STRING)
    tabledef.addColumn('CreatedDate', tde.Type.DATETIME)
    tabledef.addColumn('CreatedBy_Vendor', tde.Type.CHAR_STRING)
    tabledef.addColumn('Tier', tde.Type.CHAR_STRING)
    tabledef.addColumn('C2', tde.Type.CHAR_STRING)
    tabledef.addColumn('C4', tde.Type.CHAR_STRING)
    tabledef.addColumn('Total', tde.Type.INTEGER)
    table = tdefile.addTable('Extract', tabledef)

newrow = tde.Row(tabledef)
next(reader)  # skip the header row
for line in reader:
    # Column 0: date only.
    date = datetime.datetime.strptime(line[0], "%m/%d/%Y")
    newrow.setDate(0, date.year, date.month, date.day)
    newrow.setCharString(1, line[1])
    # Column 2: date and time.
    date = datetime.datetime.strptime(line[2], "%m/%d/%Y %H:%M")
    newrow.setDateTime(2, date.year, date.month, date.day, date.hour, date.minute, date.second, date.microsecond)
    newrow.setCharString(3, line[3])
    newrow.setCharString(4, line[4])
    newrow.setCharString(5, line[5])
    newrow.setCharString(6, line[6])
    newrow.setInteger(7, int(line[7]))
    table.insert(newrow)

tdefile.close()
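To illustrate what skipping the .csv step might look like, here is a rough sketch that fills a row straight from in-memory Python values instead of parsed CSV strings. It only relies on the setter names already used above (setDate, setDateTime, setCharString, setInteger). The fill_row helper and the RecordingRow stand-in are hypothetical names made up for this example; RecordingRow just records calls so the sketch runs without the Tableau SDK installed. With the real module you would pass a tde.Row and call table.insert(row) after each record.

```python
import datetime


def fill_row(row, values):
    """Copy one in-memory record into a Row-like object, column by column.

    Hypothetical helper: picks a setter based on the Python type of each value.
    """
    for i, value in enumerate(values):
        # Check datetime before date, because datetime is a subclass of date.
        if isinstance(value, datetime.datetime):
            # The last argument mirrors the code above, which passes microsecond.
            row.setDateTime(i, value.year, value.month, value.day,
                            value.hour, value.minute, value.second,
                            value.microsecond)
        elif isinstance(value, datetime.date):
            row.setDate(i, value.year, value.month, value.day)
        elif isinstance(value, int):
            row.setInteger(i, value)
        else:
            # Fall back to a string column for anything else.
            row.setCharString(i, str(value))


class RecordingRow(object):
    """Hypothetical stand-in for tde.Row that records every setter call."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the setters.
        def record(*args):
            self.calls.append((name,) + args)
        return record


if __name__ == "__main__":
    demo = RecordingRow()
    record = [datetime.date(2016, 1, 2), "A1",
              datetime.datetime(2016, 1, 2, 3, 4), 7]
    fill_row(demo, record)
    for call in demo.calls:
        print(call)
```

Dispatching on the value type means the date strings never have to be written out and parsed again, which is the part of the CSV round trip that costs the most time in my workflow.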
I think integration between TDE and pandas would be a great idea for an external library. We've started linking to these projects in the pandas docs.
This should be done in an external library and is out-of-scope for pandas (mainly because of the additional dependencies that tde introduces).
What's the process to suggest this for an external module? I don't mind if it is built directly into pandas. Ideally I could maintain my pandas workflow and then at the end I can import the tde module and save my files in a format that Tableau could read directly without time-consuming extracts.
@ldacey you or someone who is interested can create a new package on GitHub!
My point above is that this is too proprietary / complicated to build, so it will not be included in pandas.