Python

  • PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘builtin_function_or_method’ and ‘float’ “

    This error was something I saw at the same time as the error I discussed in my previous blogpost (here), where we are seeing conflicting data types when trying to divide each value of a count of values by the number of days in 3 months (approximately) to get a frequency value over 3 months. I did show the code to fix the error we will discuss in the previous blog post, but I will go into more detail here. The code (without the line that fixes…

    » Read more
  • PySpark Problems: Using Map() gives the error “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’ “

    Here’s the first in what will be an adhoc series of short blog posts where I will write a paragraph on the solution to problems I come across when I’m using PySpark. In this post I will discuss using the Map() function to apply a function to every value in an RDD and then getting the error message: “TypeError: unsupported operand type(s) for /: ‘Row’ and ‘float’ “ We see this error because we are…

    » Read more
  • How to add text to images using Python

    In this blog post, I will be showing how to add text to an image in Python. I used this code to generate hundreds of images with different text for one of our clients. The full script is downloadable at the bottom with the required extras. Firstly, install the Python library Pillow. Pillow is a PIL fork, install Pillow using pip and import PIL into the Python script.           In Python, select the image that you would…

    » Read more
  • My Jupyter Notebook Crashed and Now Won’t Open

    This is a problem that has plagued me a few times and is usually due to outputting too much in a single cell. The Notebook will run out of memory and crash, but it will contain the output that caused the notebook to crash. The notebook will crash every time you open it, for some browsers you may even see an ‘out of memory’ error message. The solution to this problem is to delete the vast amount of output, without opening the…

    » Read more
  • Part 4: Natural Language Processing – Bringing it all together!

    Here’s the final post in this blog series on natural language processing where we are going to bring everything together and web scrape Trust Pilot for review data, which we will then perform Natural Language Processing on and then display in a Power Bi dashboard. I’ll be talking exclusively practically in this demo, so for a refresher on the theory please refer back to my earlier blogposts (Part 1, Part 2, Part 3). To re-iterate the goal…

    » Read more
  • Part 3: Natural Language Processing – Sentiment Analysis and Opinion Mining

    If you remember in part 2 we discussed what Key Word Analysis is and how this can be implemented to gain deeper insight from textual data. But we can go one step deeper and extract feelings and opinions from the same data. We can do this through Sentiment Analysis and Opinion mining! In this blog I will talk you through what they are and how we can implement them using Microsoft’s Cognitive Services. What is Sentiment Analysis? We should…

    » Read more
  • Part 2 : Natural Language Processing- Key Word Analysis

    Here we are with part 2 of this blog series on web scraping and natural language processing (NLP). In the first part I discussed what web scraping was, why it’s done and how it can be done. In this part I will give you details on what NLP is at a high level, and then go into detail of an application of NLP called key word analysis (KWA). What is NLP? NLP is a form of artificial intelligence which deals with the interactions between humans…

    » Read more
  • Part 1: Web Scraping and Natural Language Processing- Web Scraping

    In this multi blog series I will go through what web scraping is, what Natural Language processing is as a general term as well as diving into some constituent techniques we are interested in; key word extraction, sentiment analysis and its derivative opinion mining. The last few parts will then go through a coded example of scraping the popular review site Trust pilot for reviews of the popular supermarket chain ‘Lidl’. We will then…

    » Read more
  • Creating a quadratic solver in Python

    In this blog post, I will be showing you how to create a python function that will solve any quadratic equation. Firstly, what is a quadratic equation? A quadratic is an algebraic equation that can be arranged as ax2 + bx + c = 0 where a, b, c are known and a ≠ 0. If a = 0 then the equation becomes linear as there isn’t a x2 term. Secondly, how to solve this equation? There are different ways to solves quadratic equations including: By…

    » Read more
  • Pandas; Why Use It And How To Do So!

    Introduction Hi and welcome to what will be my first frog blog! I’m Lewis Prince, a new addition to the Purple Frog team who has come on board as a Machine Learning Developer. My skill set resides mainly in Data Science and Statistics, and using Python and R to apply these. Therefore my blogs will be primarily on hints and tips on performing Data Science and Statistics through the medium of Python (and possibly R). I thought I would start…

    » Read more