Let me start by saying that I think the SSAS Tabular model is great. But…. there are a number of problems that Microsoft still need to get ironed out.
Not least of which is being able to import data properly directly from CSV/Text files. Yes you can import directly from csv, but you are given absolutely no control over the process, and this can lead to some serious problems.
One of these issues is the rather odd automated data type selection that is used by the Tabular import process. A column consisting of a combination of alpha and numerical text is more often than not treated as an integer, with all text information stripped out. And the Tabular model designer provides absolutely no way of changing this behaviour.
For example, take the following csv file, containing just three columns; ID, ItemCode and ItemValue.
The second column should clearly be interpreted as text, as the 5th row contains a non-numerical value. However when this is imported into the Tabular model, it treats the column as an integer. As you can see from the screenshot below, because the 5th row doesn’t contain a valid integer, the value is just ignored.
One would expect that we could simply go into the table in the designer, and update the Data Type property for the column. No. This just takes the numerical value and formats it as text. But any non-numerical values are still stripped out. The problem is that when the Tabular model reloads the file, it detects that the data type is an integer, and there’s nothing we can do to override it.
The only way of getting around this is by wrapping the strings in quotes within our csv.
This is irritating but we really don’t have a choice, just remember to be explicit in any csv definition that is to be imported directly into a Tabular model.
However, what if we find this out too late? If we’ve already built the model, added all of our DAX calculations in, set up the relationships etc., how do we change the format of an already created column?
Firstly we have to get the csv updated to wrap every string column in quotes.
If we try and just reimport this we’ll get an error “Unable to convert a value to the data type requested for table ‘xxx’ column ‘xxx’. The current operation was cancelled because another operation in the transaction failed.”
So first you have to change the column data type for the table in the designer. Click on the column, then in the column properties, change Data Type to Text.
Once this is done, you can reprocess the table and import the actual text.
This is all well and good, and works most of the time. However, I recently encountered a Tabular model which had this problem, and the above process wouldn’t work. So the only solution I found was to go routing in the Tabular model’s xml source code, and force it to change. Hacking it manually worked a treat, so I thought I’d share the process here. Just be careful – always keep a backup of your files before you change anything!
To do this, open up your .bim file in a suitable text editor. I highly recommend Notepad++, as it works great for XML.
We need to change three things:
1) Change the data type for the table column key & name, within the dimensions
<Alter><ObjectDefinition><Database><Dimensions><Dimension><Attributes><Attribute><KeyColumns><KeyColumn>
change <DataType>BigInt</DataType> to <DataType>WChar</DataType> (Note this may be Int or BigInt)
change <DataSize>-1</DataSize> to <DataSize>32768</DataSize>
Then do the same for the <NameColumn>
2) Change the data type for the table column key & name in the corresponding cube
<Alter><ObjectDefinition><Database><Cubes><Cube><MeasureGroups><MeasureGroup><Dimensions><Dimension><Attributes><Attribute><KeyColumns><KeyColumn>
change <DataType>BigInt</DataType> to <DataType>WChar</DataType> (Note this may be Int or BigInt)
change <DataSize>-1</DataSize> to <DataSize>32768</DataSize>
Then do the same for the <NameColumn>
3) Change the definition of the csv datasource
<Alter><ObjectDefinition><Database><DataSourceViews><DataSourceView><Schema><xs:schema><xs:element><xs:complexType><xs:choice><xs:element><xs:complexType><xs:sequence><xs:element>
delete ‘type=”xs:int”‘ from the element, and replace it with a SimpleType and restriction defining the string:
<xs:simpleType><xs:restriction base=”xs:string”><xs:maxLength value=”32768″ /></xs:restriction></xs:simpleType>
Then save the .bim file, reload your Tabular model, and reprocess the table. Problem solved.
Frog-Blog-Out