
SSAS Tabular String Imported as Integer

Let me start by saying that I think the SSAS Tabular model is great. But… there are a number of problems that Microsoft still need to iron out.

Not least of which is being able to import data properly directly from CSV/text files. Yes, you can import directly from CSV, but you are given absolutely no control over the process, and this can lead to some serious problems.

One of these issues is the rather odd automated data type selection that is used by the Tabular import process. A column consisting of a combination of alpha and numerical text is more often than not treated as an integer, with all text information stripped out. And the Tabular model designer provides absolutely no way of changing this behaviour.

For example, take the following csv file, containing just three columns: ID, ItemCode and ItemValue.

[Screenshot: ChangeTabularDataType01]
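The screenshot isn't reproduced here, so as a rough illustration (the values below are made up), the file might look something like this, with a non-numerical ItemCode in the 5th row:

ID,ItemCode,ItemValue
1,101,10
2,102,20
3,103,30
4,104,40
5,A105,50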

The second column should clearly be interpreted as text, as the 5th row contains a non-numerical value. However when this is imported into the Tabular model, it treats the column as an integer. As you can see from the screenshot below, because the 5th row doesn’t contain a valid integer, the value is just ignored.

[Screenshot: ChangeTabularDataType02]

One would expect that we could simply go into the table in the designer, and update the Data Type property for the column. No. This just takes the numerical value and formats it as text. But any non-numerical values are still stripped out. The problem is that when the Tabular model reloads the file, it detects that the data type is an integer, and there’s nothing we can do to override it.

The only way of getting around this is by wrapping the strings in quotes within our csv.

[Screenshot: ChangeTabularDataType03]
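Again as a made-up illustration, the same file with the ItemCode strings explicitly quoted looks like this:

ID,ItemCode,ItemValue
1,"101",10
2,"102",20
3,"103",30
4,"104",40
5,"A105",50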

This is irritating, but we really don’t have a choice; just remember to be explicit in any csv definition that is to be imported directly into a Tabular model.

However, what if we find this out too late? If we’ve already built the model, added all of our DAX calculations in, set up the relationships etc., how do we change the format of an already created column?

Firstly we have to get the csv updated to wrap every string column in quotes.

If we try to just reimport this, we’ll get an error: “Unable to convert a value to the data type requested for table ‘xxx’ column ‘xxx’. The current operation was cancelled because another operation in the transaction failed.”

[Screenshot: ChangeTabularDataType04]

So first you have to change the column data type for the table in the designer. Click on the column, then in the column properties, change Data Type to Text.

[Screenshot: ChangeTabularDataType05]

Once this is done, you can reprocess the table and import the actual text.

This is all well and good, and works most of the time. However, I recently encountered a Tabular model which had this problem, and the above process wouldn’t work. So the only solution I found was to go rooting around in the Tabular model’s XML source code, and force it to change. Hacking it manually worked a treat, so I thought I’d share the process here. Just be careful – always keep a backup of your files before you change anything!

To do this, open up your .bim file in a suitable text editor. I highly recommend Notepad++, as it works great for XML.

We need to change three things:

1) Change the data type for the table column key & name, within the dimensions

<Alter><ObjectDefinition><Database><Dimensions><Dimension><Attributes><Attribute><KeyColumns><KeyColumn>
change <DataType>BigInt</DataType> to <DataType>WChar</DataType>     (Note this may be Int or BigInt)
change <DataSize>-1</DataSize> to <DataSize>32768</DataSize>

Then do the same for the <NameColumn>

[Screenshot: ChangeTabularDataType06]
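As the screenshot isn't shown here, a rough sketch of what the edited KeyColumn ends up looking like (any other child elements, such as the Source column binding, are left exactly as they were):

<KeyColumn>
  <DataType>WChar</DataType>
  <DataSize>32768</DataSize>
  <!-- other child elements (e.g. the Source binding) unchanged -->
</KeyColumn>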

2) Change the data type for the table column key & name in the corresponding cube

<Alter><ObjectDefinition><Database><Cubes><Cube><MeasureGroups><MeasureGroup><Dimensions><Dimension><Attributes><Attribute><KeyColumns><KeyColumn>
change <DataType>BigInt</DataType> to <DataType>WChar</DataType>     (Note this may be Int or BigInt)
change <DataSize>-1</DataSize> to <DataSize>32768</DataSize>

Then do the same for the <NameColumn>

3) Change the definition of the csv datasource

<Alter><ObjectDefinition><Database><DataSourceViews><DataSourceView><Schema><xs:schema><xs:element><xs:complexType><xs:choice><xs:element><xs:complexType><xs:sequence><xs:element>
delete type="xs:int" from the element, and replace it with a simpleType and restriction defining the string:

<xs:simpleType><xs:restriction base="xs:string"><xs:maxLength value="32768" /></xs:restriction></xs:simpleType>

[Screenshot: ChangeTabularDataType07]
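Putting that together, the column's element in the data source view schema goes from something like the first form below to the second (the element name and the minOccurs attribute here are illustrative; only the type change matters):

Before:
<xs:element name="ItemCode" type="xs:int" minOccurs="0" />

After:
<xs:element name="ItemCode" minOccurs="0">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:maxLength value="32768" />
    </xs:restriction>
  </xs:simpleType>
</xs:element>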

 

Then save the .bim file, reload your Tabular model, and reprocess the table. Problem solved.

Frog-Blog-Out

 

 

Excel doesn’t open CSV files correctly from sp_send_dbmail

A nice little nugget today for anyone who uses SQL dbmail to send data in csv attachments.

Sometimes, depending on your language settings, CSV files don’t open correctly in Excel. Excel ignores the ‘,’ and puts all data in column A. You then have to go through the hassle of closing the file and opening it through the flat file wizard.

 

[Screenshot: excelcsv1]

There’s a very nice solution to this that I found in this thread. It’s to insert a new row at the top of the file containing “sep=,”. This forces Excel to treat the file as comma delimited, and ensures that it will open correctly.
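In other words, a file that starts like the made-up example below will open correctly, because of the sep=, instruction on the first line:

sep=,
Column1,Column2,Column3
1,abc,def
2,ghi,jkl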

The only problem: how do you insert a header line when using dbmail?…

[Caution – ugly hack alert…]

You can alter the name of the first column to include this header text. We simply rename “Column1” to “sep=,{CR}{LF}Column1”. Then when dbmail prints out the column headers in the file, the Column1 name will be split across two lines, preceded by “sep=,”.

[Screenshot: excelcsv2]

Excel treats this first row as an instruction and does not display it; it just uses it to make sure the data is formatted correctly.

So how do we do this in SQL?

DECLARE @Column1Name VARCHAR(255)
DECLARE @Query VARCHAR(2048)
SET @Column1Name = '[sep=,' + CHAR(13) + CHAR(10) + 'Column1]'
SET @Query = 'SELECT Column1 AS ' + @Column1Name + ', Column2, Column3 FROM myTable'

EXEC msdb.dbo.sp_send_dbmail
@profile_name=XXX
,[Other parameters as required]
,@query=@Query
,@attach_query_result_as_file=1
,@query_attachment_filename='QueryResults.csv'
,@query_result_separator=',' --enforce csv
,@query_result_no_padding=1 --trim
,@query_result_width=32767  --stop wordwrap

And there you go. It’s a bit of a hack, but works well.

Frog-Blog-Out
