Sport event management seems to be nearly monopolized by ArbiterSports. My lacrosse officiating assignments are administered via Arbiter, and boy oh boy does the UI leave much to be desired.
Not only is the UI lacking, but they charge for additional "mobile" features, like remote/API calendar access, let alone synchronization. So, let's see what we can do to export this calendar information anyway.
Goal
Arbiter offers an "Outlook Export" button that lets you specify the time range of the schedule to export. Clicking this button downloads a CSV:
Start Date,Subject,Start Time,End Date,End Time,All day event,Reminder on/off,Reminder Date,Reminder Time,Description,Location,Priority
"3/12/2018","St. Mark's","5:00 PM","3/12/2018","6:30 PM",FALSE,"False","3/11/2018","5:00 PM","R: John Smith 555-555-5555 || U: Jane Doe 1(666)666-6666 || F: Joe Schmoe 777-777-7777","Saint Mark",Normal
This CSV can be imported into Google Calendar. So if we can periodically download this CSV and then synchronize it with Google, we'll have our "mobile" features.
For now, let's aim to just download the CSV.
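To make the shape of that export concrete, here is a minimal sketch that pairs the header line with the sample row shown above. The helper names `parseCsvLine` and `toRecord` are hypothetical (they are not part of Arbiter or CasperJS), and the parser only assumes simple double-quoted fields on a single line. It runs under plain Node.js.

```javascript
// Split one CSV line into fields, honoring double-quoted values.
// Hypothetical helper: assumes no field spans multiple lines.
function parseCsvLine(line) {
    var fields = [];
    var current = '';
    var inQuotes = false;
    for (var i = 0; i < line.length; i++) {
        var ch = line[i];
        if (inQuotes) {
            if (ch === '"' && line[i + 1] === '"') {
                current += '"'; // doubled quote inside a quoted field
                i++;
            } else if (ch === '"') {
                inQuotes = false; // closing quote
            } else {
                current += ch;
            }
        } else if (ch === '"') {
            inQuotes = true; // opening quote
        } else if (ch === ',') {
            fields.push(current);
            current = '';
        } else {
            current += ch;
        }
    }
    fields.push(current);
    return fields;
}

// Pair a header line with a data line to get a record keyed by column name.
function toRecord(headerLine, dataLine) {
    var names = parseCsvLine(headerLine);
    var values = parseCsvLine(dataLine);
    var record = {};
    names.forEach(function (name, index) {
        record[name] = values[index];
    });
    return record;
}

var header = 'Start Date,Subject,Start Time,End Date,End Time,All day event,' +
    'Reminder on/off,Reminder Date,Reminder Time,Description,Location,Priority';
var row = '"3/12/2018","St. Mark\'s","5:00 PM","3/12/2018","6:30 PM",FALSE,"False",' +
    '"3/11/2018","5:00 PM","R: John Smith 555-555-5555 || U: Jane Doe 1(666)666-6666 ' +
    '|| F: Joe Schmoe 777-777-7777","Saint Mark",Normal';

var game = toRecord(header, row);
console.log(game['Subject'], game['Start Date'], game['Start Time']);
```

Keying each value by its column name keeps any later calendar step independent of the column order.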
CasperJS
Blindly dealing with HTTP calls and HTML responses can be done with libraries like Beautiful Soup for Python. But that approach is limited when JavaScript functions generate required data or manipulate the Document Object Model (DOM) of the webpage without you knowing.
CasperJS to the rescue. It's a scripting utility that drives a headless browser (PhantomJS or SlimerJS), complete with a JavaScript engine for evaluating HTML and any subsequent scripts. This removes the concern of having to manipulate raw HTML, and instead lets you focus on interacting with webpages to get what you want.
Without further ado, the script:
var casper = require('casper').create({
    viewportSize: {
        width: 1920,
        height: 1080,
    },
    logLevel: 'info',
    verbose: false,
});

// Shortcuts to commonly used functions
var x = require('casper').selectXPath;
var dump = require('utils').dump;

// CLI parameters: --username=foo --password=bar --organization=NCLRA
var username = casper.cli.options.username;
var password = casper.cli.options.password;
var organization = casper.cli.options.organization;

// Janky global state variable filled on 'resource.requested'
var requestedResource = null;

// Load the Arbiter Sports front page and click 'Log In'
casper.start('http://www.arbitersports.com/', function() {
}).thenClick(x('//*[@id="menu-item-76"]/a'), function() {
    // Fill in username and password information, without submitting the form
    this.fill(
        x('//*[@id="aspnetForm"]'),
        {
            'ctl00$ContentHolder$pgeSignIn$conSignIn$txtEmail': username,
            'txtPassword': password,
        }
    );

    // Submit the form via a click
}).thenClick(x('//*[@id="ctl00_ContentHolder_pgeSignIn_conSignIn_btnSignIn"]'), function() {
    // Select the organization to target -- unfortunately this is case sensitive
}).thenClick(x('//tr[./td[text() = "'+organization+'"] and .//span[text() = "Official"]]'), function() {
    // Go to the 'Schedule' tab
}).thenClick(x('//*[@id="lnkNavTabSchedule"]/a'), function() {
    // Aim to 'Outlook Export' the schedule
}).thenClick(x('//*[@id="ctl00_ContentHolder_pgeGameScheduleEdit_cmnUtilities_tskExport"]'), function() {
    /* The 'resource.requested' event hasn't been registered until now because
     * there are actually many 'POST's made to a single endpoint within Arbiter.
     * It's hard to tell them apart without inspecting individual fields being
     * submitted.
     *
     * So instead just wait to attach a handler until now, expecting that the
     * immediately following 'resource.requested' is the 'POST' that represents
     * the schedule.
     */

    // Set up the fileToDownload event handler
    var doOnlyOnce = true;
    casper.on('resource.requested', function(resource) {
        if (
            resource.method === 'POST' &&
            resource.url === 'https://www1.arbitersports.com/Official/GameScheduleExport.aspx' &&
            doOnlyOnce
        ) {
            doOnlyOnce = false;
            requestedResource = resource;
        }
    });

    // You can't 'fill' in a checkbox, so manually uncheck the 'Reminder' box
    this.click(x('//*[@id="ctl00_ContentHolder_pgeGameSchedulePrint_conGameSchedulePrint_isEnable"]'));

    // Fill in the to and from dates, clicking 'submit'
    this.fill(
        x('//*[@id="aspnetForm"]'),
        {
            'ctl00$ContentHolder$pgeGameSchedulePrint$conGameSchedulePrint$txtFromDate': '01/01/'+(new Date()).getFullYear(),
            'ctl00$ContentHolder$pgeGameSchedulePrint$conGameSchedulePrint$txtToDate': '12/31/'+(new Date()).getFullYear(),
        }
    );
}).thenClick(x('//*[@id="ctl00_ContentHolder_pgeGameSchedulePrint_navGameSchedulePrint_BtnExport"]'), function() {
    // We captured the requestedResource, replay it with an explicit call to download()
    this.download(
        requestedResource.url,
        'Export.csv',
        requestedResource.method, // 'POST'
        requestedResource.postData
    );

    // Once downloaded, gracefully exit the script
    this.exit();
});

// Kick off the execution of the script
casper.run();
Breakdown
General statements:
- Event-oriented programming is weird.
- Writing CasperJS is more akin to writing down the steps a human would take when interacting with a page, not a script of instructions.
- Some CasperJS functions wait for the resource to arrive and be evaluated before giving you control (start(), open(), thenClick()).
- Other CasperJS functions start executing immediately, potentially before your resource is ready (then(), click()).
- Global state and registering functions mid-execution is... OK... -ish.
- Man JavaScript is weird.
- XPath selection doesn't make any sense, until it does.
Let's get into interesting bits...
casper object
var casper = require('casper').create({
    viewportSize: {
        width: 1920,
        height: 1080,
    },
    logLevel: 'info',
    verbose: false,
});
This creates the casper object from the Casper module. The dictionary provided to .create() is also accessible via casper.options once created.
Setting the viewportSize is required for Arbiter; otherwise it treats you as a mobile device (the screen is too small, defaulting to 400x300).
Shortcuts
// Shortcuts to commonly used functions
var x = require('casper').selectXPath;
var dump = require('utils').dump;
- It's pretty neat that you can pass functions around as objects. x() is used heavily within the script.
--Help, --help me Rhonda
// CLI parameters: --username=foo --password=bar --organization=NCLRA
var username = casper.cli.options.username;
var password = casper.cli.options.password;
var organization = casper.cli.options.organization;
- Ain't nobody got time for a --help message. Instead, I just expect --username, --password, and --organization to exist.
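If skipping --help ever becomes painful, a minimal sketch of a fail-fast check could look like the following. `missingOptions` is a hypothetical helper, and a plain object stands in for casper.cli.options so the snippet runs under Node.js. Treating a bare flag (such as --username with no value) as `true` is an assumption about how the options are parsed.

```javascript
// Return the names of required options that are absent or carry no usable value.
// Assumption: a flag passed without a value shows up as the boolean `true`.
function missingOptions(options, required) {
    return required.filter(function (name) {
        var value = options[name];
        return value === undefined || value === null || value === '' || value === true;
    });
}

var REQUIRED = ['username', 'password', 'organization'];

// Stand-in for casper.cli.options when invoked without --password
var exampleOptions = { username: 'foo', organization: 'NCLRA' };

var missing = missingOptions(exampleOptions, REQUIRED);
if (missing.length > 0) {
    console.log('Missing required option(s): --' + missing.join(', --'));
}
```

In the real script, the same check would run before casper.start() and exit early instead of letting the login form be filled with undefined values.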
Global-ler
// Janky global state variable filled on 'resource.requested'
var requestedResource = null;
- I'm not a fan of having to do this, but I can't see how else to pass this object around from function to function without a higher global state.
- I suppose I could have performed the majority of this script within another function, and technically that wouldn't then be global... oh well.
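The "within another function" idea can be sketched with a closure. Everything named here (`createExportCapture`, `record`, `get`, the example.invalid URLs) is hypothetical and only illustrates the pattern; it is not how the script above is written.

```javascript
// Keep the captured request private to a closure instead of a global variable.
function createExportCapture() {
    var requestedResource = null; // visible only to the two functions below

    return {
        // Would be called from a 'resource.requested' handler; keeps the first hit only
        record: function (resource) {
            if (requestedResource === null) {
                requestedResource = resource;
            }
        },
        // Would be called by the later step that replays the request
        get: function () {
            return requestedResource;
        }
    };
}

var capture = createExportCapture();
capture.record({ method: 'POST', url: 'https://example.invalid/export' });
capture.record({ method: 'POST', url: 'https://example.invalid/other' }); // ignored

console.log(capture.get().url);
```

The state still outlives a single callback, but only the code holding `capture` can touch it, and the keep-the-first behavior comes along for free.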
start()
casper.start('http://www.arbitersports.com/', function() {
}).thenClick(x('//*[@id="menu-item-76"]/a'), function() {
    // Fill in username and password information, without submitting the form
    this.fill(
        x('//*[@id="aspnetForm"]'),
        {
            'ctl00$ContentHolder$pgeSignIn$conSignIn$txtEmail': username,
            'txtPassword': password,
        }
    );

    // Submit the form via a click
})
- start() will HTTP GET (configurable) the URL provided, and execute the anonymous function when the page has finished loading.
  - In this case, I've left the anonymous function blank.
- Once the main page has loaded, click on the "Log In" button.
  - This is the first instance of an XPath being used. It targets the proper element via its id attribute, menu-item-76, which should be unique in the entire DOM.
- Once this Log In page has loaded, fill in the form's username and password with the --username and --password parameters.
  - I choose not to "submit" the form because fill() doesn't wait for the next resource to become ready. Instead, I eventually call thenClick() on the submit button, which does wait for the next resource -- just personal preference.
- I think I would like to try targeting elements by their text, and not their id. This would allow XPath targets like x('//*[text() = "Log In"]'), which would work even if the underlying id changed -- as long as the text Log In remained consistent, the script would work.
  - (I didn't use it because CasperJS wasn't liking the selector for some reason. 🤷‍♂️)
Core navigation
.thenClick(x('//*[@id="ctl00_ContentHolder_pgeSignIn_conSignIn_btnSignIn"]'), function() {
    // Select the organization to target -- unfortunately this is case sensitive
}).thenClick(x('//tr[./td[text() = "'+organization+'"] and .//span[text() = "Official"]]'), function() {
    // Go to the 'Schedule' tab
}).thenClick(x('//*[@id="lnkNavTabSchedule"]/a'), function() {
})
- Clicks the "submit" button described in the above section.
- Clicks the proper organization row, which identifies your role as an "Official" within Arbiter.
  - You can be an Official for multiple organizations at once, so you must choose your specific role.
  - This XPath is funky, combining two conditions on the same row:
    - //tr[./td[text() = "${ORGANIZATION}"]]: select a tr element that has a td child whose text (identified via text()) equals "${ORGANIZATION}" (the CLI argument), and
    - .//span[text() = "Official"]: that also has a span descendant, at any depth, whose text is "Official".
- Navigates to the "Schedule" tab.
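Since the organization name is spliced straight into the XPath string, a name containing a double quote would break the expression. A minimal sketch of building the same expression with safe quoting follows. `xpathLiteral` and `organizationRowXPath` are hypothetical helpers, and the result would still be wrapped in x() in the real script.

```javascript
// Turn a JavaScript string into an XPath 1.0 string literal.
// XPath 1.0 has no escape character, so mixed quotes need concat().
function xpathLiteral(value) {
    if (value.indexOf('"') === -1) {
        return '"' + value + '"';
    }
    if (value.indexOf("'") === -1) {
        return "'" + value + "'";
    }
    // Both quote types present: stitch the pieces together with concat()
    return 'concat("' + value.split('"').join('", \'"\', "') + '")';
}

// Build the row selector: a tr with a matching td child and an "Official" span descendant.
function organizationRowXPath(organization) {
    return '//tr[./td[text() = ' + xpathLiteral(organization) +
        '] and .//span[text() = "Official"]]';
}

console.log(organizationRowXPath('NCLRA'));
```

For an ordinary name such as NCLRA this produces exactly the string the script already builds by hand, so it could be dropped in without changing behavior.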
Export form
// Aim to 'Outlook Export' the schedule
.thenClick(x('//*[@id="ctl00_ContentHolder_pgeGameScheduleEdit_cmnUtilities_tskExport"]'), function() {
    /* The 'resource.requested' event hasn't been registered until now because
     * there are actually many 'POST's made to a single endpoint within Arbiter.
     * It's hard to tell them apart without inspecting individual fields being
     * submitted.
     *
     * So instead just wait to attach a handler until now, expecting that the
     * immediately following 'resource.requested' is the 'POST' that represents
     * the schedule.
     */

    // Set up the fileToDownload event handler
    var doOnlyOnce = true;
    casper.on('resource.requested', function(resource) {
        if (
            resource.method === 'POST' &&
            resource.url === 'https://www1.arbitersports.com/Official/GameScheduleExport.aspx' &&
            doOnlyOnce
        ) {
            doOnlyOnce = false;
            requestedResource = resource;
        }
    });

    // You can't 'fill' in a checkbox, so manually uncheck the 'Reminder' box
    this.click(x('//*[@id="ctl00_ContentHolder_pgeGameSchedulePrint_conGameSchedulePrint_isEnable"]'));

    // Fill in the to and from dates, clicking 'submit'
    this.fill(
        x('//*[@id="aspnetForm"]'),
        {
            'ctl00$ContentHolder$pgeGameSchedulePrint$conGameSchedulePrint$txtFromDate': '01/01/'+(new Date()).getFullYear(),
            'ctl00$ContentHolder$pgeGameSchedulePrint$conGameSchedulePrint$txtToDate': '12/31/'+(new Date()).getFullYear(),
        }
    );
})
- Once we load the "Export Schedule" page, there is a form with "To" and "From" dates; we'll eventually need to fill them in. There's also a "Reminder" checkbox which we'll want to uncheck.
- Arguably more important, however, is the need to register a function for when the resource.requested event fires.
  - This is because reasons. Particularly because of ArbiterSports reasons.
  - We can't save the response from the resource.received event for some reason, so we're a bit janky...
- We record when a resource is requested, and aim to specifically identify the POST with our "To", "From", and "Reminder" fields. Once recorded, we can save it back to global state for future use.
- There are more resource.requested events coming, so we have a doOnlyOnce flag to help us not execute the callback function on future resources.
  - I tried unloading it -- this.on('resource.requested', function() {}); -- but that didn't work, likely because on() registers an additional listener instead of replacing the existing one. ☹️
- The year is dynamically specified via the Date class.
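One way the doOnlyOnce flag might be avoided is to detach the handler by reference once it has fired. The sketch below uses Node's EventEmitter as a stand-in for the casper object, so it runs under plain Node.js; that CasperJS's own emitter offers an equivalent removeListener() is an assumption to verify against its documentation. The `captureExport` name and the fake events are illustrative only.

```javascript
var EventEmitter = require('events').EventEmitter;

var emitter = new EventEmitter(); // stand-in for the casper object (assumption)
var EXPORT_URL = 'https://www1.arbitersports.com/Official/GameScheduleExport.aspx';
var requestedResource = null;

// Named handler so the exact same function reference can be removed later
function captureExport(resource) {
    if (resource.method === 'POST' && resource.url === EXPORT_URL) {
        requestedResource = resource;
        // Detach this handler; later matching requests are no longer inspected
        emitter.removeListener('resource.requested', captureExport);
    }
}

emitter.on('resource.requested', captureExport);

// Simulated traffic: a non-matching GET, then two matching POSTs
emitter.emit('resource.requested', { method: 'GET', url: EXPORT_URL });
emitter.emit('resource.requested', { method: 'POST', url: EXPORT_URL, postData: 'first' });
emitter.emit('resource.requested', { method: 'POST', url: EXPORT_URL, postData: 'second' });

console.log(requestedResource.postData);
console.log(emitter.listenerCount('resource.requested'));
```

The handler has to be a named function (or stored in a variable) for this to work, because removal matches on the function reference.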
The home stretch
.thenClick(x('//*[@id="ctl00_ContentHolder_pgeGameSchedulePrint_navGameSchedulePrint_BtnExport"]'), function() {
    // We captured the requestedResource, replay it with an explicit call to download()
    this.download(
        requestedResource.url,
        'Export.csv',
        requestedResource.method, // 'POST'
        requestedResource.postData
    );

    // Once downloaded, gracefully exit the script
    this.exit();
});

// Kick off the execution of the script
casper.run();
- Once we click the "Export" button, we wait for the resource.requested event to be captured (above section), and then aim to download() it (again).
- Re-using much of the data in requestedResource, we save it to "Export.csv", and exit() gracefully.
- To start the entire process, we must run() the script.
Closing thoughts
There are more things that could be done here:
- Download the file only once
- Store the file in memory to print, and then exit
I will need another script to persist this information to Google Calendar.