How do I make a multi-line string in Python (so I can ctrl+C my SQL code)
August 29th, 2023 in Blog, Jupyter, Machine Learning, Python, Spark, SQL
This should be a quick one. It took me a little while to work out, but now that I know it, I use it pretty much every day in my projects and it saves me a lot of time. It is the ability to copy and paste my SQL code into my Python code as a multi-line string, without having to reduce it to a single line or use escape characters, both of which can take a while depending on how big your SQL is. You usually need to have SQL…
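As a minimal sketch of the idea in the excerpt above: Python's triple-quoted strings keep line breaks as they are, so SQL can be pasted in unchanged. The table, column names and data below are hypothetical, and the standard-library sqlite3 module is used only as a stand-in so the query has something to run against. It is not necessarily the engine used in the full post.

```python
import sqlite3

# Triple quotes keep the pasted SQL exactly as written, line breaks included,
# with no escape characters and no need to squash it onto one line.
query = """
SELECT name,
       COUNT(*) AS orders
FROM sales
GROUP BY name
ORDER BY name
"""

# Hypothetical in-memory table, present only so the query can be executed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("ann", 10.0), ("bob", 5.0), ("ann", 7.5)],
)

rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

The same string could be handed to any function that accepts SQL text. The point is only that the query stays readable and copy-pasteable.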
PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘builtin_function_or_method’ and ‘float’”
March 31st, 2023 in Azure, Blog, Python, Spark
This error was something I saw at the same time as the error I discussed in my previous blog post (here). In both cases we see conflicting data types when trying to divide each value in a set of counts by the approximate number of days in 3 months, to get a frequency value over 3 months. I showed the code that fixes this error in the previous blog post, but I will go into more detail here. The code (without the line that fixes…
PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’”
March 31st, 2023 in Azure, Big Data, Blog, Python, Spark
Here’s the first in what will be an ad hoc series of short blog posts, each giving a paragraph on the solution to a problem I have come across when using PySpark. In this post I will discuss using the Map() function to apply a function to every value in an RDD and then getting the error message: “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’”. We see this error because we are…
My Jupyter Notebook Crashed and Now Won’t Open
February 14th, 2023 in Blog, Jupyter, Python
This is a problem that has plagued me a few times and is usually due to outputting too much in a single cell. The notebook runs out of memory and crashes, but the saved file still contains the output that caused the crash, so it crashes again every time you open it. In some browsers you may even see an ‘out of memory’ error message. The solution to this problem is to delete the vast amount of output, without opening the…
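The excerpt is cut off before it gives the author's actual steps, so the following is only one possible sketch of deleting output without opening the notebook. It relies on the fact that an .ipynb file is plain JSON in which each code cell has an "outputs" list. The helper name clear_outputs, the file name crashed.ipynb and the tiny demo notebook are all hypothetical.

```python
import json
import os
import tempfile


def clear_outputs(path):
    """Empty the outputs of every code cell in place; return how many cells had output."""
    with open(path, "r", encoding="utf-8") as f:
        nb = json.load(f)

    cleared = 0
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            if cell.get("outputs"):
                cleared += 1
            cell["outputs"] = []            # drop the stored output
            cell["execution_count"] = None  # reset the In[ ] counter

    with open(path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
    return cleared


# Hypothetical miniature notebook standing in for the one that crashed.
demo_nb = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print('x' * 10)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["xxxxxxxxxx\n"]}
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["notes"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

demo_path = os.path.join(tempfile.mkdtemp(), "crashed.ipynb")
with open(demo_path, "w", encoding="utf-8") as f:
    json.dump(demo_nb, f)

cleared_count = clear_outputs(demo_path)
with open(demo_path, encoding="utf-8") as f:
    cleaned = json.load(f)
print(cleared_count)
```

Working on a copy of the file first is a sensible precaution, since the script overwrites the notebook in place.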
Part 4: Natural Language Processing – Bringing it all together!
January 23rd, 2023 in AI, Azure, Big Data, Blog, Business Intelligence, DAX, Power BI, Python
Here’s the final post in this blog series on natural language processing, where we are going to bring everything together: we will web scrape Trustpilot for review data, perform natural language processing on it, and then display the results in a Power BI dashboard. This demo is entirely practical, so for a refresher on the theory please refer back to my earlier blog posts (Part 1, Part 2, Part 3). To re-iterate the goal…
Purple Frog at AI and Big Data Expo
December 15th, 2022 in AI, Big Data, Blog
During the 1st and 2nd of December, Purple Frog descended on London Olympia for 2 days of AI, ML and Big Data based fun at the AI & Big Data Expo. The AI and Big Data Expo is a leading Artificial Intelligence & Big Data Conference & Exhibition that showcases the next generation enterprise technologies and strategies from the world of Artificial Intelligence & Big Data, providing an opportunity to explore and discover the…
Part 3: Natural Language Processing – Sentiment Analysis and Opinion Mining
November 24th, 2022 in AI, Azure, Big Data, Blog, Business Intelligence, Python, Synapse
If you remember, in part 2 we discussed what Key Word Analysis is and how it can be implemented to gain deeper insight from textual data. But we can go one step further and extract feelings and opinions from the same data. We can do this through Sentiment Analysis and Opinion Mining! In this blog I will talk you through what they are and how we can implement them using Microsoft’s Cognitive Services. What is Sentiment Analysis? We should…
Part 2: Natural Language Processing – Key Word Analysis
August 10th, 2022 in AI, Azure, Big Data, Blog, Machine Learning, Python, Uncategorized
Here we are with part 2 of this blog series on web scraping and natural language processing (NLP). In the first part I discussed what web scraping was, why it’s done and how it can be done. In this part I will give you details on what NLP is at a high level, and then go into detail of an application of NLP called key word analysis (KWA). What is NLP? NLP is a form of artificial intelligence which deals with the interactions between humans…
Part 1: Web Scraping and Natural Language Processing – Web Scraping
July 18th, 2022 in AI, Blog, Machine Learning, Python
In this multi-part blog series I will go through what web scraping is and what natural language processing is as a general term, as well as diving into some constituent techniques we are interested in: key word extraction, sentiment analysis and its derivative, opinion mining. The last few parts will then go through a coded example of scraping the popular review site Trustpilot for reviews of the popular supermarket chain ‘Lidl’. We will then…
Deploy Azure AutoML Model And Consume In Excel
May 31st, 2022 in Blog, Machine Learning, Uncategorized
It has come back around to my turn to write a blog post. If you remember, my previous one covered why you should use Azure-based AutoML and how to do so. If you followed that, you will be left with a model that you have scored and whose performance you know, but no way to deploy and use it. I will outline the steps needed to do this (which involves a major shortcut as we are using an AutoML model), and…