In this blog post, I will run through 3 different ways to execute a delayed Python loop. In these examples, we will aim to run the loop once every minute.
To show the delay, we will print out the current datetime using the datetime module.
1 – Sleep
The sleep function from Python’s time module pauses the Python execution by the number of seconds inputted. The example below pauses the script for 60 seconds.
The above script has a 60 second delay between the end of a loop and the start of the next. However, what if we wanted to execute the next loop 60 seconds after the start of the previous loop.
In other words, how do we start the loop at the same time every minute.
2 – IF Statement
To do this we will switch to a WHILE loop and use an IF statement to see how many seconds have passed in the current minute. A WHILE loop is used because it will execute until i = 10, until the datetime has been printed 10 times. The IF statement checks the time to see when the print statement should be executed.
We will print the datetime every time the minutes changes. The Sleep function will continue to be used as the loop would execute multiple times in one second if we didn’t.
date_time[-2:] – is the last two characters of our datetime variable which is the number of seconds that have passed this minute.
3 – Task Scheduler
The previous two options are good for executing a loop a few times, ten in our case. If we wanted to execute a python script continuously without expiring, we could use the above examples with an infinite appending loop.
However, if one loop errors the script will stop. Therefore, we want to execute the entire Python script once a minute using an external trigger. This is where we can use Task Scheduler.
Task Scheduler can execute a python script from source but it is often easier to use a batch file. The batch file includes the location of the python application (python.exe) and the location of the python script (.py). For more detail on using Task Scheduler and batch files to run Python scripts, please see the following datatofish post – https://datatofish.com/python-script-windows-scheduler
Our batch file is:
To demonstrate Task Scheduler, I’m going to run the following Python code every minute.
This code uses Pandas to produce a blank CSV file, but the name of CSV file is the datetime the script was run.
These following screenshots show the triggers and actions used.
This produced the following CSV files, we can see that the files takes 1 – 4 seconds to create.
In summary we have seen three different ways to delay a Python loop, two using loops inside Python and one using Task Scheduler. All can be used depending on what kind of delay is best.
A common request that is raised by clients is how to sort a table in Power BI by multiple columns, in the same way you can in Excel.
For a while, there was no way (at least no easy way) to do this until the Power BI March 2020 update.
I learnt this tip from the following YouTube video:
Full credit to Dhruvin Shah, check his video out.
Below I have a Power BI table displaying fruit sales, currently unsorted.
To sort the table by Fruit click on the column header Fruit.
The table is now sorted by Fruit in alphabetical order.
To add a secondary sort on Sales, hold the Shift key and click on the column header Sales.
The table is now sorted by
– Fruit name in alphabetical order
– Sales in descending order
Some extras to note:
– There is no limit on the number of columns that can be used to sort a table. Just hold the shift key and keep choosing columns.
– This feature is not available for matrices.
– To switch the sorting from ascending to descending or vice-versa continue to hold shift and click on the column header again.
This post shows hows how you can capture and store the number of records inserted, updated or deleted from a T-SQL Merge statement.
This is in response to a question on an earlier post about using Merge to load SCDs in a Data Warehouse.
You can achieve this by using the OUTPUT clause of a merge statement, including the $Action column that OUTPUT returns.
The basic syntax is:
INSERT INTO XXXX SELECT [Action] FROM ( MERGE XXXX AS Target USING XXXX AS Source ON XXXX=XXXX WHEN MATCHED AND XXXX <> XXXX THEN UPDATE SET XXXX=XXXX WHEN NOT MATCHED THEN INSERT ( XXXX ) VALUES ( XXXX ) OUTPUT $action AS [Action] ) MergeOutput
You wrap the Merge statement up as a sub-query, adding the OUTPUT clause to return details about what happened in the merge. Note that you can’t just select from this sub-query, there has to be an INSERT INTO statement.
One row will be returned for each row touched by the merge process.
The $action column will contain either INSERT, UPDATE or DELETE, to indicate what happened to that row.
You can also include Source.* in order to include the source column values in the output dataset.
You can also include DELETED.*, which returns the values of any updated records before they were updated, and INSERTED.* to show the values after the updated. In reality the records are not deleted or inserted, just updated, but DELETED/INSERTED is used as the terminology for old/new values either side of the update. When inserting a new record, DELETED values will be NULL.
... OUTPUT $action AS [Action], Source.*, DELETED.*, INSERTED.* ) MergeOutput
You can then refer to this ‘MergeOutput’ result set at the top of the query by selecting from this sub-query.
There is a limitation, though, you can’t aggregate the table. So if we want to summarise the actions into a single row of insert, update and delete counts, we have to use a temporary table such as in the sample code below.
CREATE TABLE #MergeActions ([Action] VARCHAR(10)) INSERT INTO #MergeActions ([Action]) SELECT [Action] FROM ( MERGE [dim].[Date] AS Target USING [etl].[DimDate] AS Source ON ISNULL(Target.[DateKey],0) = ISNULL(Source.[DateKey],0) WHEN MATCHED AND (Target.[Date] <> Source.[Date] OR Target.[Month] <> Source.[Month] OR Target.[Quarter] <> Source.[Quarter] OR Target.[Year] <> Source.[Year] ) THEN UPDATE SET [Date] = Source.[Date] ,[Month] = Source.[Month] ,[Quarter] = Source.[Quarter] ,[Year] = Source.[Year] ,LastUpdated = GetDate() WHEN NOT MATCHED THEN INSERT ( [DateKey] ,[Date] ,[Month] ,[Quarter] ,[Year] ,LastUpdated ) VALUES ( Source.[DateKey] ,Source.[Date] ,Source.[Month] ,Source.[Quarter] ,Source.[Year] ,GetDate() ) OUTPUT $action AS [Action] ) MergeOutput ; SELECT SUM(CASE WHEN [Action]='INSERT' THEN 1 ELSE 0 END) AS InsertCount ,SUM(CASE WHEN[Action]='UPDATE' THEN 1 ELSE 0 END) AS UpdateCount ,SUM(CASE WHEN [Action]='DELETE' THEN 1 ELSE 0 END) AS DeleteCounts FROM #MergeActions GROUP BY [Action] DROP TABLE #MergeActions
We all know and love Power BI, because of it’s power and ease of use, there’s a good reason why it’s number 1 in the Gartner Quadrant. But how to we avoid ‘Power BI Hell’, the new version of our old nemesis ‘Excel Hell’?
How do we keep track of which data sources are being used by which datasets and reports? How to we perform an impact assessment of updating/changing a dataset? What happens if someone deletes a Power BI report? There has been no easy solution to this, until now.
Step forward a new SaaS tool, Power BI Sentinel
Sentinel is a new Software as a Service (i.e. hosted in the cloud) tool which sits alongside your PowerBI.com account, helping you manage things better. Core features include:
- Data Lineage documentation – visibility of which reports are using which datasets, and which data sources. Plus showing which downstream datasets and reports will be impacted by deleting or changing a data source or dataset
- Automated Backups – backing up your PBIX files directly to your own Azure storage. Allowing you to roll back, recover deleted files, and see what the report/data looked like in the past for debug/investigation
- Automated Dataset Documentation – Generate online or PDF documentation of your Power BI datasets, the model, tables, DAX measures, calculated columns, etc.
- Report Change Tracking – If someone’s broken a report by adding/removing a filter, changing a visual, etc. the change tracking will show you what has changed, and when, allowing you to roll back to the correct version prior to the change
The diagram above shows a sample Data Lineage diagram of a Power BI Dataset, showing which data sources (databases, files, websites, etc.) are used by the dataset, and which reports then consume this dataset.
Scenario 1: Power BI Governance, Compliance, GDPR
You have a data source, lets say an Oracle database, containing sensitive customer details (PII). When auditing your organisation’s usage of this data, which of your 600 Power BI reports do you need to review to assess whether that PII data is exposed or not, and to who?
The Power BI Data Lineage explorer allows you to select a datasource, and immediately see which datasets and reports consume data from that source, including those hidden away in users’ personal ‘My Workspace’ folders. Straight away refining the list of sensitive reports to a handful.
Scenario 2: Power BI Disaster Recovery
Your well behaved development teams obviously follow good DevOps practice for all their development work (hopefully!), but Power BI is a self service tool, allowing your business users to create their own reports. And you’re almost certainly not going to get business users using source control and good testing practices on their reports.
Self service is a very powerful feature which significantly enhances the benefit of Power BI to your organisation. But if ‘Mary from Finance’ creates a report that becomes business critical, if that report gets corrupted, broken, deleted or changed, there is no recovery or rollback to the previous version of that report.
Power BI Sentinel not only gives you visibility of whether these critical reports are hidden in personal workspaces, but also performs regular automated backups of the reports, allowing you to recover and roll back if required.
Power BI is a very powerful tool, but with great power comes great responsibility.
Power BI Sentinel is a great way of providing transparency over your Power BI tenant, and protecting you if anything should ever go wrong.
Sign up for a licence at PowerBISentinel.com
I think it’s about time for a good old rant. Nothing to do with Business Intelligence, but…
Anyone out there who uses Sky Anytime (and I believe the BBC iPlayer) will have probably noticed that they use a nasty peer-to-peer file sharing application called Kontiki, which they install automatically without giving you any control over. My problem is that I have a good 8Mb download speed but my upload is still a rather paltry 256K (I know, it’s almost embarrasing). I use Sky Anytime to download a film probably no more than once a month, the rest of the time Kontiki sits there on my PC hogging my precious upload bandwith. They don’t give you any control over restrict it or turning it off – hense my rant.
I’m happy to leave it running for a bit when it’s not causing a problem, but I want to easily be able to turn it off. So, the solution? Create a Kontiki-Killer batch file that disables it. When you next launch Sky Anytime it will automatically load again, but when you’re finished, you can run this batch file to kill the little blighter.
It kills the KHost.exe process, stops the KService service, sets the KService to start manually not automatically (so it will only start when you launch Sky Anytime), and removes KHost.exe from the startup options in the registry (Sky will add them back in when it is next launched).
I use this on my XP Pro SP2 box, I’ve not tried it on any other OS, so I can not accept any responsibility for any problems is may cause – use entirely at your own risk.
Copy this into a blank text file, and save it as KillKontiki.bat
@Echo off Echo. Echo. Echo Kontiki-Killer... Echo. - In progress... Echo. Echo Killing KHost.exe process... taskkill /IM KHost.exe /T /f Echo. Echo Stopping KService service... NET Stop KService Echo. Echo. Making the KService service on demand not on startup sc config KService start= demand Echo. Echo Removing from Registry startup.... REG DELETE HKLM\Software\Microsoft\Windows\CurrentVersion\Run /v kdx /f REG DELETE HKCU\Software\Microsoft\Windows\CurrentVersion\Run /v kdx /f Echo. Echo. Echo Kontiki blitz complete. Echo. Echo. (C) Purple Frog Systems Echo. pause
You can also download it here
Well, what can I say – the very first Frog-Blog post.
Keep checking back often for new and (hopefully) useful info…