The TweetWhisperer

Today is GDG DevFest 2019 in Trinidad. The organizers put out a call for sessions, and I was happy to share one of the ideas that had been rolling around in my head for a while.

I Facebook in pirate, don’t @ me.

So, here’s the TL;DR: my idea was to take my likes on @Twitter and funnel them into Google Keep. Along the way, I’d automatically categorize the tweets and then confirm that categorization via a chatbot. Simple stuff.

So simple, I didn’t even use Visio to diagram it.

What I actually did:

Twitter Likes

I made an Azure Function that would periodically poll my Twitter account and get the last tweets I liked. To do this, I had to create a developer account on Twitter to get the appropriate creds. The function was pretty simple.
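Something like this sketch captures the shape of it (not the exact code I deployed; LikesClient is a hypothetical wrapper around the Twitter favorites endpoint, and the handle and setting names are placeholders):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class HarvestLikes
{
    // Timer-triggered function: poll Twitter for my latest likes and hand them
    // off for categorization. LikesClient is a hypothetical wrapper around the
    // Twitter favorites API, configured with the developer-account creds.
    [FunctionName("HarvestLikes")]
    public static async Task Run(
        [TimerTrigger("0 */30 * * * *")] TimerInfo timer, // every 30 minutes
        ILogger log)
    {
        var client = new LikesClient(
            Environment.GetEnvironmentVariable("TwitterApiKey"),
            Environment.GetEnvironmentVariable("TwitterApiSecret"));

        var likes = await client.GetRecentLikesAsync("my_handle", count: 20);

        foreach (var tweet in likes)
        {
            log.LogInformation("Liked tweet {Id}: {Text}", tweet.Id, tweet.Text);
            // ...queue the tweet for categorization and, eventually, Keep
        }
    }
}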

Categorizing Likes

In the DotNetConf keynote a few weeks ago, I saw an ML.NET demo and I got the idea to use it here, too.

ML.NET to build models (easy peasy)

All my notes

I pulled all my notes from Keep to train an ML model. It was very easy, particularly because I used gkeepapi, an unsupported library for interacting with Keep.

Doing this made me glad that I could think in terms of a bunch of cooperating functions, because the function to extract the notes from Keep was written in Python, while almost everything else is in C#.

KeepIt: A function to get my notes from Google Keep
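Back in C#, the ML.NET training pipeline was roughly this shape (a sketch; the exported-notes CSV, column layout and class names are illustrative, not my exact schema):

using Microsoft.ML;
using Microsoft.ML.Data;

// Illustrative schema for a note exported from Keep: its text and the
// category it was filed under.
public class NoteData
{
    [LoadColumn(0)] public string Text { get; set; }
    [LoadColumn(1)] public string Category { get; set; }
}

public class NotePrediction
{
    [ColumnName("PredictedLabel")] public string Category { get; set; }
}

// ...

var mlContext = new MLContext(seed: 1);
var notes = mlContext.Data.LoadFromTextFile<NoteData>(
    "keep-notes.csv", hasHeader: true, separatorChar: ',');

// Featurize the note text and train a multiclass classifier over the categories.
var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", nameof(NoteData.Category))
    .Append(mlContext.Transforms.Text.FeaturizeText("Features", nameof(NoteData.Text)))
    .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy("Label", "Features"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

var model = pipeline.Fit(notes);

// Later: categorize a liked tweet with the trained model.
var engine = mlContext.Model.CreatePredictionEngine<NoteData, NotePrediction>(model);
var guess = engine.Predict(new NoteData { Text = "Great thread on Azure Functions cold starts" });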

The funny thing is, I didn’t really need the model. Most of the things I stored in Keep fell into one or two categories – not very good for modelling. I guess I hadn’t really realized that. To be honest, I was probably storing things in Keep that I had a very high priority on, which turned out to be mostly cloud things. Go figure.

How the bot will help change things

So, I’m grabbing my tweets, categorizing them based on history and preference, and I’m ready to store them, except, as I’ve shown above, my categorization is whack. Thus, I also made a chatbot that will take my liked tweets and ask me to adjust the category each one has been labelled with.
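A minimal sketch of how such a bot could look with the Bot Framework’s ActivityHandler (simplified; in reality you’d carry the tweet and its suggested category around as conversation state):

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Schema;

// Simplified flow: the bot has proposed a category for a liked tweet and I
// either accept it ("ok") or reply with a better one.
public class TweetWhispererBot : ActivityHandler
{
    protected override async Task OnMessageActivityAsync(
        ITurnContext<IMessageActivity> turnContext,
        CancellationToken cancellationToken)
    {
        var text = turnContext.Activity.Text?.Trim();

        var reply = string.Equals(text, "ok", StringComparison.OrdinalIgnoreCase)
            ? "Keeping the suggested category and saving the tweet to Keep."
            : $"Got it, re-filing that tweet under '{text}'.";

        await turnContext.SendActivityAsync(MessageFactory.Text(reply), cancellationToken);
    }
}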

TweetWhisperer: Helping me categorize my tweets better.

So, with these three components – a likes-harvester, a categorizer and a chatbot – maybe I’ll get better at returning to the topics I have an interest in.

An unexpected journey

LOL.

My family was going to the beach a few weekends ago, and so I decided to try and catch some miles on the way there. Literally, I’d jump out of the car just before we got there and jog the rest of the way.

We were going to Las Cuevas beach, which for some reason I thought was a short distance from Maracas beach. It’s not. It’s really not.

I mean, I try to do a long run on the weekend, so an extra mile on my usual 6 shouldn’t have been too arduous. However, Maracas Bay Village is the hilliest place I’ve ever run!

So, four miles in I was exhausted, but thankfully, I had my buds in and my podcasts were on fire. I was recently interviewed on Tim Bourguignon’s Software Developer’s Journey podcast, so I subscribed to it, and it turns out to be just what you need when you’re on a long run with no end in sight. Tim uses interviews to tease out compelling, interesting developer journeys, which at times can feel just like my run felt. It was so good to have that distraction.

I was able to finish two episodes: one featuring Stephanie Hurlburt, which, if I had to rename it, would be “The Importance of Balance”, and one featuring Guillermo Rauch.

Just when I would have pulled a Geoffrey and jumped in a car to get the rest of the way, I saw an old man who said Las Cuevas was one ridge over, and it was!

Lightning in a Hansard bottle

Some of the technology team that brings the Hansard online in Trinidad & Tobago

When we built the Hansard Speaks chatbot in 2017, I was super excited and told all my friends about it. One of them now works in IT at the Parliament and he invited me to talk with the team about it.

At the brief talk, I spoke about the motivations for building the chatbot, how we thought it was a great way to win arguments about who said what in Parliament, and that we liked how easy it was to bring an automated conversational experience into that world.

I think the team at the Parliament does a great job. I’ve always liked that they were among the early movers in bringing Internet technology into governance. They’ve been online for a long time, they make copies of the Hansard available on their site and they stream proceedings. They’re also on Twitter and are pretty active.

We spoke about how much Hansard Speaks leverages cloud technology, and the fact that, though the government is progressing, the policy on public cloud means they have to find ways to use on-prem tech to accomplish similar functionality. HS uses the Microsoft Bot Framework, Azure Media Services and Azure App Services. If they wanted to do that off the cloud, they could, but it would just be a bit harder.

I’d love it if they shared more about what they do, in terms of all the teams that go into making the Hansard come alive. There’s a real production team around ensuring those documents get generated and that the parliament team can handle requests that come down from MPs about who said what, when.

Two years on from when we first built the chatbot, I described to them one key change we might make if we were doing it again: we would use a tool like Video Indexer on the videos they create.

Video Indexer’s insights on a Parliament video from Trinidad and Tobago.

It would let us do more than simply get a transcript: we would be able to get information on who said what and how much they contributed to a session, as well as the key topics that were discussed.
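Getting at those insights is mostly REST calls against the Video Indexer API. Roughly something like this (the account, video id and token are placeholders, and the endpoint shape is from my reading of the Video Indexer docs, so verify it there before copying):

using System.Net.Http;
using System.Threading.Tasks;

// Rough sketch: fetch the insights (speakers, transcript, keywords, topics)
// for a video that has already been uploaded to Video Indexer.
// location, accountId, videoId and accessToken are all placeholders.
public static async Task<string> GetInsightsAsync(
    string location, string accountId, string videoId, string accessToken)
{
    using (var http = new HttpClient())
    {
        var url = $"https://api.videoindexer.ai/{location}/Accounts/{accountId}" +
                  $"/Videos/{videoId}/Index?accessToken={accessToken}";

        // The JSON that comes back is what the insights view in the portal renders.
        return await http.GetStringAsync(url);
    }
}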

So, it was great to speak with some of the guys behind the Hansard and share with them ideas on services they can leverage to make their offering even more compelling, insightful and relevant.

Learning to scale

My friend Christopher shared this on Facebook. It’s a queue for CXC results.

Tomfoolery. That’s the word that bubbled into my mind when I saw this screenshot and understood what it was describing.

Admittedly, this was a bit harsh. I mean, someone built this; they took the time to craft a way to deal with the rush of users trying to get their examination results from the Caribbean eXamination Council (CXC). CXC has a classic seasonal scale problem.

During the year, the number of users hitting their servers is probably at a manageable level. And when results come out, they probably see a 10-fold increase in access requests. They’d have the numbers, but I think this guess is reasonable.

I was curious about the traffic data on cxc.org, so I got this from similarweb.com

Their solution may have been reasonable in 2012, when scaling on the web was more of a step-wise exercise in resource management than the flexible thing it is today. Scaling may have meant buying new servers and software and getting a wider team to manage it all.

But there are solutions today that don’t necessarily mean a whole new IT investment.

SAD sporting
I really dig this chart from Forbes demonstrating the seasonality challenge

So, how do you contend with seasonal IT resource demand without breaking the bank? Design your solution to leverage “The Cloud” when you need that extra burst.

“The Cloud” in quotes because if things were as easy as writing the line “leverage The Cloud”, then we wouldn’t be this deep into a blog post about it. So, to get specific, here’s what I mean:

Plan the resources needed. In this case, it might be a solution that uses load balancing, where some resources are handled on-prem and capacity from the cloud is used as needed. Alternatively, a whole new approach to sharing the resources at play might be worth investigating – keeping authentication local, but sending users to access results stored in the cloud via a serverless solution, is well worth considering.
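To make the serverless bit concrete, here’s a hypothetical sketch: results get published ahead of time as small blobs in cloud storage, and a function serves them out, so the results-day burst never touches the on-prem systems. Everything here (names, storage layout) is made up for illustration:

using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class GetResults
{
    // Hypothetical: one small JSON blob per candidate, published before results day.
    [FunctionName("GetResults")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "get", Route = "results/{candidateId}")] HttpRequest req,
        string candidateId,
        ILogger log)
    {
        var container = new BlobContainerClient(
            Environment.GetEnvironmentVariable("ResultsStorage"), "results");
        var blob = container.GetBlobClient($"{candidateId}.json");

        var exists = await blob.ExistsAsync();
        if (!exists.Value)
            return new NotFoundResult();

        var download = await blob.DownloadContentAsync();
        return new OkObjectResult(download.Value.Content.ToString());
    }
}

A function app like this scales out with the burst and costs next to nothing for the rest of the year, which is exactly the seasonal shape of the problem.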

I don’t want to spec out the entire solution in two paragraphs. I do want to say CXC can and should do better, since they serve the needs of much of the Caribbean and beyond.

One Marvelous Scene – Tony & Jarvis

My video editing skills are what you might call “Hello World” level. So, I’m not interested in bringing down the average, as it were, in terms of the quality of the content over on the official “One Marvelous Scene” playlist on YouTube.

I dig what those storytellers did a lot. But as far as I can tell, even though there are 67 videos on that list, they missed a key one. It was from when Tony had returned home in Iron Man 1 and was building out the new suit, working with Jarvis.

Back when IM1 came out, I remember being delighted by the movie, just for the flow of it. As a young developer then, seeing a portrayal of a “good programmer” was refreshing. And he was getting things wrong: making measurement mistakes or pushing things out too quickly. He had to return to the drawing board a lot. It was messy. Just like writing code.

And while doing this, he was working along with Jarvis. An AI helper with “personality”. One that he could talk to in a normal way and move his work along.

In 2008, that was still viewed as fanciful, in terms of a developer experience, but even then I was somewhat seriously considering its potential, which is why I even asked about voice commands for VS on StackOverflow 👇🏾

Jump forward to 2019 and the Bot Framework is a thing, the StackOverflow bot is also a thing, and Team Visual Studio introduced IntelliCode at this year’s Build conference.

So, as a scene in a movie, Tony talking with Jarvis helps us understand Tony’s own creative process, but for me, it gave glimpses into the near future for how I could be building applications. And that’s marvelous.

A short note on value-based pricing

This is a short note because I’m not an economist and don’t pretend to be one. I think areas like pricing and value and cost determination are complex and should be given their just consideration, however, I recently saw a question on Caribbean Developers and I wanted to share what insights I have on it.

When I first saw this, I knew immediately I wanted to say, “Aye Roger, lean away from thinking hourly rates”, but that felt a bit too curt.

It’s been my experience that freelance developers tend to think in terms of charging hourly rates and costing work based on that rate. The “better” or more experienced they get, the more the rate reflects their growth/maturity in the space. Then, they learn about markups, based on their understanding of customer risk and all that good stuff. But I’ve come to appreciate that thinking in terms of hourly rates for freelance work is a trap. The actual term for what I think is a better approach is value-based pricing.

I found a great explanation of it on Quora:

Value-Based Pricing means presenting a price to the purchaser which is based on their perception of the value they will derive from the result being discussed

More from David Winch, Pricing Coach, on Quora.

So, I think Roger needs to spend time understanding how to develop a firm that uses value-based pricing. It’ll help much more in the long run.

After Roger posted his question, I saw a great point from @sehurlburt on the matter too:

Jumping over hurdles to get to insights using Azure

On the way to getting some data into an Azure DB, I explored a strategy for using Entity Framework that was such a cool timesaver, I thought I’d share it for #GlobalAzureBootcamp

I could have done this years ago. It’s just that sometimes a task feels so complex, daunting or mind-numbingly boring that you make do with alternatives until you just have to bite the bullet.

We work with SMPP at Teleios. It’s one of the ways you can connect to mobile carriers and exchange messages. From time to time, we need to analyze the SMPP traffic we’re sending/receiving, and typically we use Wireshark for this. That’s as easy as setting it up to listen on a port and then filtering for the protocol we care about. Instead of actively monitoring with Wireshark, we may at times use dumpcap to get chunks of files with network traffic for analysis later on.

What we found was that analyzing file by file was tedious; we wanted to analyze a collection of files at once. We’d known of cloud-based capture analysis solutions for a while, but they tended to focus on TCP and maybe HTTP. Our solution needed to be SMPP-aware. So, we decided to build a rudimentary SMPP database to help with packet analysis.

That’s where the mind-numbingly boring work came in. In the SMPP 3.4 spec, a given SMPP PDU can contain 150 fields. The need for analysis en masse was so great that we had to construct that database. But this is where the ease of using modern tools jumped in.

I got a list of the SMPP fields from the Wireshark site. In the past, I would have then gone about crafting the SQL that represented those fields as columns in a table. But now, in the age of ORMs, I made a class in C#. If you’re following along from the top: I created a project in Visual Studio, turned on Entity Framework migrations and added a DataContext. From there, it was plain sailing, as I just needed to introduce my new class to my DataContext and push that migration to my db.
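A heavily trimmed sketch of what that class and context could look like (names here are illustrative; the real class carries the ~150 fields from the Wireshark list):

using Microsoft.EntityFrameworkCore;

// One property per SMPP field; the real class carries all of them,
// lifted from the Wireshark field list.
public class SmppPdu
{
    public int Id { get; set; }
    public uint CommandLength { get; set; }
    public uint CommandId { get; set; }
    public uint CommandStatus { get; set; }
    public uint SequenceNumber { get; set; }
    public string SourceAddr { get; set; }
    public string DestinationAddr { get; set; }
    public string ShortMessage { get; set; }
    // ...and so on for the remaining fields
}

public class SmppAnalysisContext : DbContext
{
    public SmppAnalysisContext(DbContextOptions<SmppAnalysisContext> options)
        : base(options) { }

    public DbSet<SmppPdu> Pdus { get; set; }
}

From there, Add-Migration and Update-Database (or the dotnet ef equivalents) generate and apply the schema.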

It probably took me 30 minutes to go from the list of 150 fields on the Wireshark site to being able to create the database with all the necessary tables. Now, where does Azure come into all of this?

Each capture file we collect could contain thousands of messages. So, in my local development environment, when I first tried to ingest all of those into my database, my progress bar told me I’d be waiting a few days. With Azure, I rapidly provisioned a database and had its structure in place with this gem:

Database.Migrate();

That is, from the Entity Framework DataContext class, I called the Database.Migrate() method and any newly set up db would have my latest SMPP table. From there, I provisioned my first 400 DTU Azure SQL database and saw my progress bar for ingestion drop from days to hours.
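Putting that together, the idea looks roughly like this (connection string name, batch size and the parsedPdus collection are illustrative; parsedPdus stands in for whatever the capture-file parser produces):

using System;
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

// Point the same context at Azure SQL, let Migrate() bring the schema up to
// date, then ingest the parsed PDUs in batches instead of row by row.
var options = new DbContextOptionsBuilder<SmppAnalysisContext>()
    .UseSqlServer(Environment.GetEnvironmentVariable("AzureSqlConnection"))
    .Options;

using (var db = new SmppAnalysisContext(options))
{
    db.Database.Migrate(); // a fresh database gets the SMPP table immediately

    var buffer = new List<SmppPdu>(1000);
    foreach (var pdu in parsedPdus) // parsedPdus: IEnumerable<SmppPdu> from the capture files
    {
        buffer.Add(pdu);
        if (buffer.Count == 1000)
        {
            db.Pdus.AddRange(buffer);
            db.SaveChanges();
            buffer.Clear();
        }
    }

    if (buffer.Count > 0)
    {
        db.Pdus.AddRange(buffer);
        db.SaveChanges();
    }
}

Batching the SaveChanges calls is what keeps the ingestion from crawling; the 400 DTUs do the rest.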

With a database that size, queries over thousands of rows went by reasonably fast and gave us confidence that we’re on the right path.

We’re also working on a pipeline that automates ingestion of new capture files, very similar to what I did last year with my Azure Container Instances project.

So, for #GlobalAzureBootcamp 2019, I’m glad that we were able to step over the hurdle of tedium into richer insights in our message processing.