Blog

  • Why Are There Gaps in the Identity Column in My SQL Database?

    All too often there are questions asked about missing records that are identified by gaps in the identity column of a table. If a table, ordered by the identity column, shows “1, 2, 3, 5”, record 4 must have been deleted right? Well, although that is a possibility, it’s not a safe assumption as there are other causes of these “missing” IDs. Let’s start with a simple table, it only has 3 columns, the first of which being an…

    » Read more
  • #This vs @That – Should I use a Table Variable or a Temporary Table?

    When working with SQL Server, it’s not uncommon to need to store data in a temporary table or table variable. While both options can be used to accomplish the same goal, there are differences between the two that can affect performance and the ability to write efficient code. Let’s explore the differences between table variables and temporary tables, and when to use each one. @Table Variables Table variables are declared using the…

    » Read more
  • PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘builtin_function_or_method’ and ‘float’ “

    This error was something I saw at the same time as the error I discussed in my previous blogpost (here), where we are seeing conflicting data types when trying to divide each value of a count of values by the number of days in 3 months (approximately) to get a frequency value over 3 months. I did show the code to fix the error we will discuss in the previous blog post, but I will go into more detail here. The code (without the line that fixes…

    » Read more
  • PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’ “

    Here’s the first in what will be an adhoc series of short blog posts where I will write a paragraph on the solution to problems I come across when I’m using PySpark. In this post I will discuss using the Map() function to apply a function to every value in an RDD and then getting the error message: “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’ “ We see this error because we are…

    » Read more
  • Why should you invest in your team and go to SQLBits?

    A week or two back the entire Purple Frog and Power BI Sentinel team traveled over to Wales to attend SQLBits, which if you didn’t know, is the world’s largest data platform event! This year we had the privilege to go SQLBits wearing two hats; delegate and sponsor – go big or go home, right? I thought I’d share some insights into what I, and the team, thought about spending a week at the ICC Wales with overwhelming amounts of training to…

    » Read more
  • How to add text to images using Python

    In this blog post, I will be showing how to add text to an image in Python. I used this code to generate hundreds of images with different text for one of our clients. The full script is downloadable at the bottom with the required extras. Firstly, install the Python library Pillow. Pillow is a PIL fork, install Pillow using pip and import PIL into the Python script.           In Python, select the image that you would…

    » Read more
  • Power BI – Visual to show filters that have been applied to a report

      Most PowerBi reports will use slicers to allow the user to filter and interact with the visuals they have. But in practice, users may forget what slicers have been applied. It’s inconvenient to have to look through each slicer, potentially across multiple pages, to see what filters are being used on the report. Therefore, it is not user friendly. We need to show every filter that has been applied in one convenient place. Thankfully,…

    » Read more
  • Excel, CSV and Leading Zeros

    Yes, even at Purple Frog Systems we use Excel! Excel is great for that quick and dirty analysis, but there is nothing worse than opening a CSV file directly with Excel and finding out you’ve lost all of your leading zeros. Take a look at my Sales.csv file Viewed as a CSV in Notepad++ it looks as it should… However viewed directly using Excel… It now looks like I’ve got a sales quantity of 62 for product id 5! To solve…

    » Read more
  • Linked Server with Unique Identifier Issue

    There is currently an issue with Linked Servers when querying and filtering by a guid (uniqueidentifier). If you query a Synapse SQL pool from your on-prem SQL server, for example, and attempt to filter the results using a uniqueidentifier column, you will get the following error message: Msg 103010, Level 16, State 1, Line 1 Parse error at line: 1, column: 408: Incorrect syntax near ‘guid’. Msg 110811, Level 16, State 1, Line 1…

    » Read more
  • My Jupyter Notebook Crashed and Now Won’t Open

    This is a problem that has plagued me a few times and is usually due to outputting too much in a single cell. The Notebook will run out of memory and crash, but it will contain the output that caused the notebook to crash. The notebook will crash every time you open it, for some browsers you may even see an ‘out of memory’ error message. The solution to this problem is to delete the vast amount of output, without opening the…

    » Read more