Scripting download of file from box.com link

Hi everyone,

I’m a bit of a noob at high-level programming and web stuff. I’ve only really used VB.NET in Visual Studio (we learned VB6 in school).

I need to reliably download this file once every day using some sort of script on Windows: https://cbaa.app.box.com/s/yemq5chyijnstu2foyzv2xucist56riw

If I just use the full link that the browser gets when I click the download button

e.g.

https://public.boxcloud.com/api/2.0/files/52633519149/content?preview=true&version=448269628762&access_token=1!E0ySP34v3XRU2Bnq_2z4GkYanErlwlR3fGmXyXhOp6EZonZrayju6boWYzdMFEumUuThORTqun0raAdKBXzVAzAyfqhyG9HkTAJEXE_8A9L6cPV7hXyIDk9Ujx37lUfs2ijO_DpA_RI-vPqYbcB_I4Sgk5pv6C9SeI1C5lmmrHoOPWlytcTDc7vPHH9cyD5PNMBWmSjpC8gSulfIYb9gRMT5CLArfJzRglu-USxm_OA3KQDwSt2NYPThnhU-CyljGntYUgLpMGcnZ55OxZUGAvLmLaSMeSqcC0VpOr2ZeMwZMuwILM1qOacE14GBHHhh0W8GS47lfFIbKkUpqs0A3ub6w_SfuIr2kWleYdQO9qwQ7AQP68eTnJpW6OfUQ2RB9y4TmVoOTkB_PADThEKY3T8itXCPeTic7B_Uk0YEWQ9DHRg-AL4BSpbDpXN324T9T2-cOqXpaJ8K05dOS1sdc0LNF19XYWzOtg5cWPBTFfk9ov8Q1LjZV4NkFgKOSeFgLvZgpIwXlnmOVuAXXjbpDgC9hPv7NBcSK8t-mSHTB5l5AC3P1xXfgOFUr8lk&shared_link=https%3A%2F%2Fcbaa.app.box.com%2Fs%2Fyemq5chyijnstu2foyzv2xucist56riw&box_client_name=box-content-preview&box_client_version=2.2.1

it times out after a few tries.

Is there a way to automatically generate this “expanded” link?

Is this a really simple task? If so what should I search for for more info?

Thanks, Timothy.

I’d use Python to make a small script that could do it for me. There are libraries available to interact with websites.

I’d think you’ll have to authenticate yourself on the website to get the link, since it contains an access_token. You might be able to keep the session alive so it never times out, but at that point you might as well authenticate from your script.
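To see what the “expanded” link is actually made of, here is a minimal sketch that splits it into its query parameters using only the standard library. The URL below is a shortened stand-in for the one in the question, and PLACEHOLDER_TOKEN is a made-up value, not a real token. My assumption is that the access_token is issued per browser session, which would explain why a copied link stops working.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened stand-in for the link in the question. The access_token value
# is a placeholder (assumption), not a real token.
expanded_link = (
    "https://public.boxcloud.com/api/2.0/files/52633519149/content"
    "?preview=true&version=448269628762"
    "&access_token=PLACEHOLDER_TOKEN"
    "&shared_link=https%3A%2F%2Fcbaa.app.box.com%2Fs%2Fyemq5chyijnstu2foyzv2xucist56riw"
)

parts = urlsplit(expanded_link)
# parse_qs decodes percent-encoding and returns each value as a list
params = parse_qs(parts.query)

print("host:", parts.netloc)
for name, values in params.items():
    print(f"{name}: {values[0]}")
```

The shared_link parameter is just the original share URL, so the only part a script cannot know in advance is the token.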

Look for tutorials on Beautiful Soup and requests; you can also search for “web scraping”. It’s reasonably straightforward to get started with, and there are some decent tutorials on the matter.

I can recommend PyCharm from JetBrains (jetbrains.com) as an IDE for Python. There is a 30 day trial on their website, and it costs very little if you want to continue using it. Python is a very powerful language, not only because of the number of libraries available but also because of what you can do with it in general, on all platforms.

One thing to remember: there are subtle differences between Python 2.7 and 3+, so make sure you are using the same version as your guide. It can be a pain to find the missing : or ().
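As a rough illustration of the kind of difference I mean, here are two that trip people up most often. This is a small sketch written for Python 3; the comments describe how Python 2.7 behaved.

```python
import sys

# Python 3: print is a function, so the parentheses are required.
# In Python 2.7 this was usually written: print "Running on Python", version
version = "%d.%d" % (sys.version_info.major, sys.version_info.minor)
print("Running on Python", version)

# Python 3: / always returns a float, and // is floor division.
# In Python 2.7, 7 / 2 with two integers gave 3.
half = 7 / 2
floored = 7 // 2
print(half, floored)
```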

Finally, if you enjoy programming, learn Python, it’s versatile and has many applications.

Good luck

Edit: Just checked the website you linked. I don’t think authentication is needed. If you view the page source and search for “<audio”, you can pull the link you want by extracting the “src” attribute. This is easily achievable with, for instance, Beautiful Soup.
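Here is a minimal sketch of that idea. To keep it free of extra installs I’ve swapped Beautiful Soup for the standard library’s html.parser; the approach is the same. The sample HTML and the names AudioSrcFinder, find_audio_sources and save_url are made up for illustration, and I’m assuming the audio tag is present in the HTML the server returns (if the page builds it with JavaScript, this won’t find it).

```python
import shutil
import urllib.request
from html.parser import HTMLParser


class AudioSrcFinder(HTMLParser):
    """Collects the src attribute of every <audio> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lowercase
        if tag == "audio":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_audio_sources(html_text):
    """Return the src of each <audio> tag found in html_text."""
    finder = AudioSrcFinder()
    finder.feed(html_text)
    return finder.sources


def save_url(url, dest_path):
    """Stream the response body of url into the file dest_path."""
    with urllib.request.urlopen(url, timeout=30) as response:
        with open(dest_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)


# Made-up page fragment for illustration only; the real page will differ.
sample_html = (
    '<div><audio controls '
    'src="https://example.com/files/episode.mp3"></audio></div>'
)
print(find_audio_sources(sample_html))
```

In a real run you would fetch the share page first, pass its text to find_audio_sources, and then hand the first result to save_url. Task Scheduler can run the script once a day.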

I know this forum favors open source applications, but since you are on Windows I think a scheduled task running PowerShell might suit your needs with minimal installs/modifications. The Task Scheduler portion is the easy part, so let’s determine if PowerShell can do it before we set anything up.

Disclaimer: I’m only OK at Powershell - I won’t get defensive if anyone chimes in with a better way to do this because I’m sure there are many.

The page you linked shows a download button.
Taking a peek via Chrome DevTools shows that it is a button element with a class.

Windows exposes Internet Explorer as a COM object, so PowerShell and other .NET applications can open an IE window and work with it.

#Create an IE Object
$ie = New-Object -com InternetExplorer.Application

#Allow it to open an IE Window. 
$ie.visible = $true

#Open the website in IE
$ie.navigate("https://cbaa.app.box.com/s/yemq5chyijnstu2foyzv2xucist56riw")

These commands should open an IE window.

The variable $ie now holds the live Internet Explorer window as an object.
Further interactions can be performed with the variable in this powershell session.

You can output the object directly to see what it contains.

$IE

I wasn’t sure what to look for, so I expanded the properties until I found the page info in the Document property.

If we try a standard select command then it just displays the type data.
$ie | select document

We want to expand this and treat it as its own object.
The easiest way I know to do this is to wrap the variable in parentheses and access the property with a dot.
($ie).document

This gives us a ton of properties but did not allow me to filter by the button class we found earlier in Chrome. I googled a bit and narrowed what we want down to the .getElementsByTagName() function.

$ie.Document.getElementsByTagName("Button")

This command dumped less data, but the button properties are still way too much to sort through. We need a way to narrow this down so we can find the right button.

Looking back at the Chrome Devtools we can see a data-resin-target property that contained “Download”.

I experimented with selecting properties until I found a way to output all the downloads.
Poking through the properties, I determined that the outerHTML property seems to contain a dump of the related HTML. We should be able to output that by piping to Select-Object.

$ie.Document.getElementsByTagName("Button") | select outerhtml

The Download portion of that output looks to be what we want, so let’s drop a Where-Object statement in the pipeline to filter this madness down a bit.

$ie.Document.getElementsByTagName("Button") | where{$_.Outerhtml -like "*Download*"} | select outerhtml
We are down to 3 results now!
“Download File” looks like the best choice, so let’s adjust our wildcard filter to that.

$ie.Document.getElementsByTagName("Button") | where { $_.outerhtml -like "*Download File*" } | select outerhtml

Bingo.
We are down to 1 result.

PowerShell needs the entire element object, not just its outerHTML text, to trigger the download, so let’s take out the select filter. With that removed, we can call the .click() method that I found during my googling on the element. Some posts put the whole element grab into a variable, but I’m just going to wrap our filter in parentheses and put the .click() after it.
($ie.Document.getElementsByTagName("Button") | where { $_.outerhtml -like "*Download File*" }).click()
This command will execute on the IE window.
Partial Success! We now have a download prompt but it requires a manual click.

I need to walk away for a bit, but this may be enough to start tinkering on your own. I should have more time to look at this tonight.

The remaining tasks left are:

  • Determine a method of clicking save
  • Tweak it so that it works without opening a window
  • Set up a scheduled task to run it

Powershell Script so far:

#https://forum.level1techs.com/t/scripting-download-of-file-from-box-com-link/

#Tested on a machine using Windows 10 and powershell 5.1
#Powershell 5.1 download available here if needed.
#https://www.microsoft.com/en-us/download/details.aspx?id=54616

#Create an IE Object
$ie = New-Object -com InternetExplorer.Application

#Allow it to open an IE Window. 
$ie.visible = $true

#Open the website in IE
$ie.navigate("https://cbaa.app.box.com/s/yemq5chyijnstu2foyzv2xucist56riw")

#Wait for the page to load.
#I added this later because of a delay my machine had opening IE.
while($ie.ReadyState -ne 4) { start-sleep -s 10 }

#Grab the proper download button and click it.
#Note: Sometimes this doesn't work correctly, even with the 10 second wait when a window opens up; may have to look for an alternative.
#click() returns nothing, so there is no value to store in a variable.
($ie.Document.getElementsByTagName("Button") | where { $_.outerhtml -like "*Download File*" }).click()