C# Archive - CraftCoders.app
https://craftcoders.app/category/c-sharp/

Do it yourself filesystem with Dokan
https://craftcoders.app/do-it-yourself-filesystem-with-dokan/
Mon, 13 Aug 2018 08:00:00 +0000

What’s up guys?! A week has passed and it’s time for a new blog post. This time I am gonna give you a small introduction to Dokan. You don’t have a clue? Never heard of Dokan before? No problem… I hadn’t either. But in the life of a student there comes the moment when one must write a Bachelor thesis, no matter how much you procrastinate. As part of my thesis, I had to write a filesystem, and this is exactly where Dokan came into play to save my ass.

WHAT IN THE HELL IS DOKAN?!

So let’s start from the beginning. As I mentioned before, I had to implement my own filesystem. Yeah, basically you could write your own filesystem driver. But that would be like writing a compiler to print a simple “Hello World”. Fortunately, there is a really cool concept which is heavily used in the Linux world. It’s called FUSE (Filesystem in Userspace). With FUSE, everyone is able to create their own filesystem without writing a kernel component. FUSE empowers you to write a filesystem in the same manner as a user-mode application. So what is Dokan?! Dokan is simply FUSE for Windows. It is as simple as that. You can even use Dokan to run a filesystem that was written with FUSE on Windows.

Okay cool… How does this magic work?!

So you are right. Without a kernel component, there is no way to implement a filesystem. But you don’t have to write it, because Dokan did. Dokan ships with two components: (1) dokan1.sys, alias “Dokan File System Driver”, and (2) dokan1.dll, which is used in a “Filesystem Application”. So take a look at the picture below.

First, a random application is running. It could be Word, Visual Studio, IntelliJ or your web browser rendering a website that is trying to write a virus into a file. Let’s assume it is the web browser. If the web browser tries to write some content x to a file y, it’s gonna fire an I/O-request.
This I/O-request is processed by the Windows I/O Subsystem. Note that by passing the I/O-request to the Windows I/O Subsystem we leave user mode and enter the kernel mode of Windows (this is 1 in the picture above).

Secondly, the Windows I/O Subsystem will delegate the I/O-request to the driver responsible for the filesystem. In our case that would be the Dokan File System Driver, which is dokan1.sys. Please note that we did not write any code for that driver; it just needs to be installed (this is 2 in the picture above).

Third, our Filesystem Application, which has registered itself with the Dokan File System Driver, gets notified about the I/O-request. By implementing the interface which comes with dokan1.dll, our Filesystem Application is now responsible for processing the I/O-request. Whatever has to be done needs to be done by our Filesystem Application. And as you might already guess: yes, this is the part we need to write! The Filesystem Application then invokes a callback function and the Dokan File System Driver is back in line (this is 3-4 in the picture).

Last but not least, the Dokan File System Driver receives the I/O-response created by our Filesystem Application and invokes the callback routine of the Windows I/O Subsystem. The Windows I/O Subsystem then forwards the result to the application which created the I/O-request. In our case, that is the web browser with the shady website (this is 5-6 in the picture).

Just do it! Writing a simple Filesystem

Scope

Okay, we are actually not going to implement a complete filesystem. That would be too much for a blog post. We are doing something simple. Let’s create a filesystem that contains a fake file which can be read with an editor.

Warmup: Preparations

As I already mentioned before, we need to install the Dokan File System Driver. It is used as a proxy for our Filesystem Application. You can get the latest Version here.
As soon as we have the Dokan File System Driver installed, we can create a blank C# Console Application. Please note that you could also use Dokan with other languages. As always, you can find my solution on GitHub. After we’ve created the Console Application, which will be our Filesystem Application, we need to add the Dokan library (dokan1.dll). Luckily there is a NuGet package.
Everything settled? Let the game begin!

Mount the FS

First of all, we need to implement the IDokanOperations interface that comes with the Dokan library. Since this is for learning purposes, I didn’t create a second class, so everything is in one class. In the Main() method I create a new instance of the class and mount the filesystem.

static void Main(string[] args)
{
    var m = new StupidFS();
    // mounting point, dokan options, num of threads
    m.Mount("s:\\", DokanOptions.DebugMode, 5);
}

1..2..3.. and it crashed! What happened? As you can see from the console output, several I/O-requests failed. First, the GetVolumeInformation operation failed and then the Mounted operation. They failed because we did not implement them yet. But it’s simple: in the GetVolumeInformation request we just need to provide some information for the OS. Basically, this is just some meta information about our filesystem, like its name, how long a path component can get and which features it supports. Let’s implement it:

[...]

public NtStatus GetVolumeInformation(...)
{
    volumeLabel = "CraftCode Crew";
    features = FileSystemFeatures.None;
    fileSystemName = "CCCFS";
    maximumComponentLength = 256;

    return DokanResult.Success;
}

[...]

public NtStatus Mounted(DokanFileInfo info)
{
    return DokanResult.Success;
}

But it won’t work yet. We also need to “implement” the CreateFile method:

public NtStatus CreateFile(...)
{
    return DokanResult.Success;
}

Did you notice how every method returns an NtStatus? This status indicates whether a request has failed (and why) or succeeded. You might wonder why we need to return Success in the CreateFile method just to mount the filesystem. As soon as we mount the filesystem, it tries to create some file proxies. If we throw an exception, our filesystem ends up in a bad state.

Faking a file

Whenever the filesystem has to list files there are two possible requests: FindFilesWithPattern and FindFiles. Luckily, we just need to implement one and suppress the other. We are going to implement the FindFiles method. Therefore we return DokanResult.NotImplemented in FindFilesWithPattern, so whenever the filesystem gets this request it will be rerouted to FindFiles.
One of the parameters of FindFiles is a list of FileInformation objects. We are just going to fake one item and add it to a list which is assigned to the parameter files.

public NtStatus FindFilesWithPattern(...)
{
    files = null;

    // Not implemented on purpose: the request falls back to FindFiles.
    return DokanResult.NotImplemented;
}

public NtStatus FindFiles(...)
{
    var fileInfo = new FileInformation
    {
        FileName = "carftCodeCrew.txt",
        CreationTime = DateTime.Now,
        LastAccessTime = DateTime.Now,
        LastWriteTime = DateTime.Now,
        // Length is the file size in bytes, not bits.
        Length = System.Text.Encoding.ASCII.GetByteCount(FileContent),
    };

    files = new List<FileInformation> { fileInfo };

    return DokanResult.Success;
}

And we did it! Our filesystem now shows one file!
Did you notice FileContent? It’s just a global string containing the content of our textfile.
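A side note on the Length field: a file size is counted in bytes, not bits. Here is a minimal sketch of how the size could be derived, assuming FileContent is encoded as ASCII like in the rest of this post. The class name and the sample text are made up for illustration.

```csharp
using System;
using System.Text;

public static class FileSizeSketch
{
    // Hypothetical stand-in for the global FileContent string.
    public const string FileContent = "Hello from the fake file!";

    // Number of bytes the content occupies once it is encoded as ASCII.
    public static long SizeInBytes(string content)
    {
        return Encoding.ASCII.GetByteCount(content);
    }

    public static void Main()
    {
        // ASCII uses one byte per character, so this equals FileContent.Length.
        Console.WriteLine(SizeInBytes(FileContent));
    }
}
```

If you ever switch to another encoding such as UTF-8, the byte count can differ from the string length, so asking the encoding is the safer habit.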

Reading with the good old Notepad

Let’s read data from our filesystem! We want to read the string FileContent with the good old Notepad. So, first of all, we need to make changes to the CreateFile method. Whenever a file or directory is opened, the CreateFile method gets invoked. We need to provide the DokanFileInfo object with a context. In case of a read operation, the context is a stream where the data is located. Since we want to read a string, we are going to use a MemoryStream.

public NtStatus CreateFile(...)
{
    if (fileName.Equals(@"\carftCodeCrew.txt"))
    {
        info.Context = new MemoryStream(System.Text.Encoding.ASCII.GetBytes(FileContent));
    }

    return DokanResult.Success;
}

We are close but not quite there. When an application like Notepad tries to open a file, it also wants to read the meta information of the file, for example to set the window title. Therefore we need to implement the GetFileInformation method. Since our filesystem just has one file, it is really trivial:

public NtStatus GetFileInformation(...)
{
    fileInfo = new FileInformation
    {
        FileName = "carftCodeCrew.txt",
        Attributes = 0,
        CreationTime = DateTime.Now,
        LastAccessTime = DateTime.Now,
        LastWriteTime = DateTime.Now,
        // Length is the file size in bytes, not bits.
        Length = System.Text.Encoding.ASCII.GetByteCount(FileContent),
    };

    return DokanResult.Success;
}

Now we are really close. We just need to implement the ReadFile method. In this method, we get the stream from DokanFileInfo.Context and then read the bytes that have been requested. It is really as simple as this.

public NtStatus ReadFile(...)
{
    bytesRead = 0;

    if (info.Context is MemoryStream stream)
    {
        stream.Position = offset;
        bytesRead = stream.Read(buffer, 0, buffer.Length);
    }

    return DokanResult.Success;
}
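To get a feeling for what offset and bytesRead do here, the same two lines can be tried out without Dokan. This is just a small sketch with plain .NET types; ReadSketch and ReadAt are names I made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadSketch
{
    // Copies up to buffer.Length bytes, starting at offset, from the stream
    // into buffer and returns how many bytes were actually copied.
    public static int ReadAt(MemoryStream stream, byte[] buffer, long offset)
    {
        stream.Position = offset;
        return stream.Read(buffer, 0, buffer.Length);
    }

    public static void Main()
    {
        var stream = new MemoryStream(Encoding.ASCII.GetBytes("0123456789"));
        var buffer = new byte[4];

        int first = ReadAt(stream, buffer, 0);   // fills the buffer with "0123"
        int last = ReadAt(stream, buffer, 8);    // only "89" is left
        int beyond = ReadAt(stream, buffer, 10); // at the end: nothing to read

        Console.WriteLine($"{first} {last} {beyond}"); // prints "4 2 0"
    }
}
```

A caller that asks for more than is left simply gets fewer bytes back, and bytesRead tells it how many of the buffer’s bytes are valid.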

The lovely CraftCodeCrewFS

Xamarin: Tabbed Page Navigation
https://craftcoders.app/xamarin-tabbed-page-navigation/
Mon, 25 Jun 2018 08:00:56 +0000

If you have a background in native mobile development and for some reason had to switch to Xamarin, you are probably familiar with that “oh yeah, it doesn’t work this way here” feeling. Having to find new solutions for problems you haven’t even considered to be problems before is what I would call business as usual in Xamarin. Call me a weirdo, but that is also what made me enjoy learning the technology so much.

The project with which I had to dive into cross-platform development was a big adventure on its own – a very ambitious idea, a rather vague picture of the expected result (and therefore constantly changing requirements), a critical lack of time and a total of 3 developers, all rather new to the technology. Time pressure, general cluelessness about Xamarin and no expert looking over our shoulder turned into hours of online research, in the desperate hope that someone had already encountered the exact same issue. I know that, as any other developer, you have been there yourself, and I don’t have to tell you about that inexplicable relief of finding what you were looking for. Just as much as I don’t have to elaborate on the frustration of trying to fix the problem that you’ve created, trying to fix the problem that you’ve created, trying to fix the problem…

One of the issues which nearly caused such an infinite loop of research was connected to building up a pretty common navigation concept. Here is what we were trying to achieve:

After logging in, a user should see their dashboard. From there they can navigate by switching between individual tabs. Those are always present at the bottom of the screen, “framing” any active content. Each tab has its own navigation tree, meaning one can go back and forth between pages inside any tab. If a user decides to switch between tabs, the app remembers the current position in all the trees. This way one can always pick up from where they left the tab, no matter what has happened in the meantime.

This navigation concept has been around for a while, and you could probably name a couple of apps using it off the top of your head. However, just like I have mentioned earlier, things in Xamarin are not necessarily what, where and how you are used to. It’s not rocket science, but if you are new to the platform, this little tab page navigation tutorial could hopefully save you some headache. Let’s dive right into it.

Xamarin Navigation Basics

1. Defining MainPage
The root page of your application is defined in App.xaml.cs, which is a part of the shared code. Here you can tell your app, which page to display on start.

public App()
{
    InitializeComponent();
    MainPage = new MyMainPage();
}   

2. Hierarchical navigation
Hierarchical navigation is the most basic way to move between pages in Xamarin. You can think of it as a stack of cards in solitaire. The first card in the stack is your root page. You can put new pages (or cards) on top of it in any order you like, as long as you comply with a couple of predefined rules. In a solitaire, these rules would tell you, say, to only put lower ranked cards on top of higher ranked ones. In a Xamarin app, the rules define legitimate navigation paths. They are nested in the code and influence the way a user can interact with your app.

So how does it look from a less abstract, more code-oriented perspective? Your navigation tree is represented by a NavigationPage. You can pass the root page of the tree as a constructor argument, like this:

var myNavigationPage = new NavigationPage(new MyPage());

Now that you have your base card on the table, you can add some more on top of it, by performing a push operation. In this example, you push a new page on the navigation stack after a button is clicked:

private async void Handle_MyButtonClicked(object sender, EventArgs e)
{
    await Navigation.PushAsync(new MyOtherPage());
}

If you want the top page of the stack to disappear and show the page below, do so by calling the pop method of Navigation property:

private async void Handle_DismissButtonClicked(object sender, EventArgs e)
{
    await Navigation.PopAsync();
}

Now that you can manipulate your stack, it’s time to check out, how tabbed pages work in Xamarin.

Tabbed page navigation

Step 1. Creating a Tabbed Page

public partial class MyTabbedPage : TabbedPage
{
    public MyTabbedPage()
    {
        // Your constructor code here
    }
}

All we do here is inherit from the TabbedPage class, which is provided by Xamarin. It is important to remember that tabs will not render the exact same way on different platforms. On iOS, they will show up at the bottom of the screen, beneath the actual page content. Android, on the contrary, displays tabs at the top of the page. There are many more differences in tab behavior across the two platforms; the most important ones are covered here.

Note: For our scenario, we had to go for a unified look (icon tabs on the bottom) for both iOS and Android. If this is what you are going for as well, check out the BottomTabbedPage control by Naxam.

Step 2. Making TabbedPage your root
All you are doing here is telling your App to display the tabbed page you’ve created in the previous step on start. Use the MainPage property to do so.

public App()
{
    InitializeComponent();
    MainPage = new MyTabbedPage();
}   

Step 3. Filling the tabs
How many tabs is your app going to have and what are they going to be? When you’ve decided on the navigation concept and created all the necessary pages, all you need to do is pass them to the tabbed page.

Let’s go back to our solitaire metaphor for this one. Imagine, that your game has a row of three cards in the beginning. Maybe, two of those are the basis for stacks a player needs to build, and the third one is just a single card, pointing at a trump. How do you create an analogy of this constellation within a tabbed page?

The game field you lay out your cards on is your tabbed page. It should display three cards in one row so you will need the total of three tabs. One of them is very simple since it only has one card in it. Nothing goes on top of that card, and the player does not have the power to alter it. For our app, it means that no navigation takes place inside the tab. This behavior is going to be represented by a ContentPage.

The other two cards have a more complex role. They will become stack bases. Cards are going to be put on top of them, and, maybe, taken back off. For our pages, this means that they are going to become roots of their own navigation trees, which users will interact with. By pressing buttons or swiping, they will travel from one page to another, pushing them on or popping them off the stack. To enable this, we will need a NavigationPage class to wrap around our root page. It will allow us to use the push and pop methods and let our users travel up and down the stack.

An important note here: you should not wrap the TabbedPage itself into a NavigationPage. iOS does not support this, and you are going to run into trouble trying to do so. Your concept should be based on populating a TabbedPage with ContentPages and NavigationPages only.

Finally, you can add content and navigation pages to the TabbedPage using its Children property:

public MyTabbedPage()
{
    // Your constructor code here
    Children.Add(new NavigationPage(new MyPage()));      // tab with its own stack
    Children.Add(new NavigationPage(new MyOtherPage())); // tab with its own stack
    Children.Add(new MyContentPage());                   // single page, no navigation
}

Step 4. Manipulating the stacks
NavigationPages give you a couple of methods to work with the stack. We are only going to look closely at the two most basic ones today: PushAsync() and PopAsync(). These methods can be called on the Navigation property of a Page and allow you to put a new card on top of your stack or remove the top card.

Any class, which derives from Page provides you with a Navigation property. Sometimes, however, it might be useful to manipulate the navigation tree from outside the pages themselves. In order to expose the tree to other classes in the app, you might consider declaring a public variable in your App class:

public static INavigation Nav { get; set; }
public App()
{
    InitializeComponent();
    //...
    var myNavigationPage = new NavigationPage(new MyPage());
    Nav = myNavigationPage.Navigation;
}

After you took care of this, you can access the navigation tree from anywhere inside of your app:

private async void Handle_SomeNavigationButtonClicked(object sender, EventArgs e)
{
    await App.Nav.PushAsync(new MyOtherPage());
}

While this step can be useful regardless of the navigation concept of your application, it was absolutely crucial for our multiple-navigation-stacks solution, because we needed to know which branch of our navigation tree we are on at all times.

Step 5. Knowing where you are
How do you let the App know which tab is currently active so that the pages get pushed to the right stack? One way to achieve this would be to let TabbedPage take care of changes by overriding its OnCurrentPageChanged() method:

public partial class MyTabbedPage : TabbedPage
{
    public MyTabbedPage()
    {
        // Your constructor here 
    }

    protected override void OnCurrentPageChanged()
    {
        base.OnCurrentPageChanged();
        App.Nav = CurrentPage.Navigation;
    }
}

Now, whenever you are calling the Navigation property using App class, the active stack is going to be chosen.

Extras

The basic idea of navigation is simple. You have an overall context, your App class, which has the information about your current location and provides a mechanism to interact with it. You can use this mechanism from anywhere inside the application to go forward and backward in the navigation stack. Classes requesting such manipulations do not have any knowledge about the selected tab or current position in its navigation stack. All they need to do is ask the App to perform a specific operation on the right stack.

Depending on your project, you might need some extra functionalities in connection to your TabbedPage. Here are two optional steps we had to use in our app:

Step 6. (Optional) Reset all stacks
Using the concept of separate navigation trees inside a TabbedPage, you might also want to be able to reset all of them at once. For instance, if a user logs out, you would not want the app to keep all the navigation stacks for the next logged in user to see. What you would need here is to call PopToRootAsync() method on the Navigation property of each one of your TabbedPage children.
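A minimal sketch of how such a reset could look, assuming the tabs were added as in Step 3. ResetAllStacksAsync is a hypothetical helper name, not something Xamarin provides.

```csharp
using System.Threading.Tasks;
using Xamarin.Forms;

public partial class MyTabbedPage : TabbedPage
{
    // Hypothetical helper: pops every tab's navigation stack back to its root page.
    public async Task ResetAllStacksAsync()
    {
        foreach (Page child in Children)
        {
            // Only NavigationPage tabs own a stack; plain ContentPage tabs are skipped.
            if (child is NavigationPage)
            {
                await child.Navigation.PopToRootAsync();
            }
        }
    }
}
```

You could call this from your logout handler before showing the login page again.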

Step 7. (Optional) Set selected tab
Sometimes it can also be useful to define selected tab programmatically. In our case, a user should always land on a specific tab after login. To achieve this you can set the CurrentPage property of your TabbedPage.
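As a rough sketch, assuming the tabs are addressed by their position in Children (SelectTab is a made-up helper name):

```csharp
using Xamarin.Forms;

public partial class MyTabbedPage : TabbedPage
{
    // Hypothetical helper: selects the tab at the given position in Children.
    public void SelectTab(int index)
    {
        // Ignore positions that do not correspond to an existing tab.
        if (index >= 0 && index < Children.Count)
        {
            CurrentPage = Children[index];
        }
    }
}
```

Since setting CurrentPage changes the active tab, an OnCurrentPageChanged() override like the one from Step 5 should run as well and keep App.Nav pointing at the right stack.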

I guess that would be it for our short dive into the unwonted peculiarities of cross-platform development. I hope this post was useful for some of you Xamarin warriors out there. Just like any of us here at Billige Plätze, I would be very happy to hear from you in the comment section below. Any feedback is appreciated.

Let’s learn from each other.
Dannynator.

AI as a Service – a Field Report
https://craftcoders.app/ai-as-a-service/
Mon, 11 Jun 2018 14:20:06 +0000

In this blog post I describe my experiences with AI services from Microsoft and how we (team Billige Plätze) were able to create hackathon-award-winning prototypes with them in roughly 24 hours. The post is structured according to the hackathons in which we used AI services. As of now (11.06.2018) there are two of them, but I’m sure there are a lot more to come. The first one was the BlackForest Hackathon, which took place in autumn 2017 in Offenburg, and the second one was the Zeiss hackathon in Munich, which took place in January 2018.
This post is not intended to be a guide on integrating said services (Microsoft has nice documentation for all their products 🙂 ), but rather a field report on how these services can be used to realize cool use cases.

Artificial Intelligence as a Service (AIaaS) is a third-party offering of artificial intelligence, accessible via an API. So, people get to take advantage of AI without spending too much money, or if you’re lucky and a student, no money at all.

The Microsoft AI Platform

As already mentioned above, we have used several Microsoft services to build our prototypes, including several Microsoft Cognitive Services and the Azure bot service. All of them are part of the Microsoft AI platform, where you can find services, infrastructure and tools for Machine Learning and Artificial Intelligence you can use for your personal or business projects.

we used only some of the services of the AI platform

BlackForest Hackathon

The BlackForest Hackathon was the first hackathon ever for our team and we were quite excited to participate.

The theme proposed by the organizers of the hackathon was “your personal digital assistant”, and after some time brainstorming we came up with the idea of creating an intelligent bot, which assists you with creating your learning schedule. Most of us are serious procrastinators (including me :P), so we thought that such a bot can help us stick to our learning schedule and motivate us along the way to the exam.
The features we wanted to implement for our prototype are

  • asking the user about his habits (usual breakfast, lunch and dinner time as well as sleep schedule),
  • asking the user about his due exams (lecture, date and the amount of time the user wants to learn for the exam) and
  • automatic creation of learning appointments, based on the user input, within the user’s Google Calendar.

With the topic of the hackathon in mind we wanted the bot to gather the user input via a dialog as natural as possible.

Billige Plätze in Action

The technology stack

Cem and I had experimented a little bit with the Azure Bot Service before the hackathon, and we thought it to be a perfect match for the task of writing the bot. We also wanted the bot to process natural language and stumbled upon LUIS, a machine learning based service for natural language processing, which can be integrated seamlessly into the bot framework (because it is from Microsoft, too).
Our stack consisted of, as expected, mainly Microsoft technologies. We used

  • C#,
  • .NET Core,
  • Visual Studio,
  • Azure Bot Service,
  • LUIS and
  • Azure.

The combination of C#, the bot service and LUIS provided the core functionality of our bot and we were able to deploy it to Azure with one click.

The Azure Bot Service

The Bot Service provides an integrated environment that is purpose-built for bot development, enabling you to build, connect, test, deploy, and manage intelligent bots, all from one place. Bot Service leverages the Bot Builder SDK with support for .NET and Node.js.

Overview of the Bot Service

The bot service consists of the concepts of

  • channels, which connect the bot service with a messaging platform of your choice,
  • the bot connector, which connects your actual bot code with one or more channels and handles the message exchange between the channels and the bot via
  • activity objects.

Dialogs, another core concept, help organize the logic in your bot and manage conversation flow. Dialogs are arranged in a stack, and the top dialog in the stack processes all incoming messages until it is closed or a different dialog is invoked.
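The stack behavior is easier to see in a toy model. The following sketch uses made-up types (ToyDialog, ToyDialogStack) and is not the Bot Builder SDK; it only illustrates that the top dialog receives every message until it is closed.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a dialog stack; these types are made up for illustration
// and are not part of the Bot Builder SDK.
public class ToyDialog
{
    public string Name { get; }
    public ToyDialog(string name) { Name = name; }

    public string Handle(string message) => $"{Name} got: {message}";
}

public class ToyDialogStack
{
    private readonly Stack<ToyDialog> _dialogs = new Stack<ToyDialog>();

    // Invoking a dialog puts it on top; it now receives all messages.
    public void Call(ToyDialog dialog) => _dialogs.Push(dialog);

    // Closing the top dialog hands control back to the one below.
    public void Done() => _dialogs.Pop();

    // Only the top dialog processes an incoming message.
    public string Post(string message) => _dialogs.Peek().Handle(message);
}

public static class DialogStackDemo
{
    public static void Main()
    {
        var stack = new ToyDialogStack();
        stack.Call(new ToyDialog("Root"));
        stack.Call(new ToyDialog("ExamDate"));

        Console.WriteLine(stack.Post("23rd of August")); // ExamDate got: 23rd of August
        stack.Done();
        Console.WriteLine(stack.Post("thanks"));         // Root got: thanks
    }
}
```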

General Conversation flow of the Bot Service

By using the bot service we were able to focus on programming the actual conversation with the Bot Builder SDK. To use the bot service you just have to create a new bot in Azure and connect it to the channels of your choice (Telegram worked like a charm), also via the web app.
After creating your bot in Azure you can start coding right away by using a template provided by Visual Studio. You just have to type your bot credentials into the configuration file and you’re good to go. Because we didn’t have to worry about where to host the bot and how to set it up, we were able to quickly create a series of dialogs (which involved serious copy pasting :P) and test our conversation flow right away by using the Bot Framework Emulator. When we were happy with the results, we published the bot on Azure with one click in Visual Studio.
We didn’t have to worry about getting the user input from the messaging platform, and integrating natural language understanding into our bot was very easy because we used Microsoft LUIS. We were seriously impressed by the simplicity of the bot service.

Microsoft LUIS

LUIS is a machine learning-based service to build natural language into apps, bots, and IoT devices.
You can create your own LUIS app on the LUIS website. A LUIS app is basically a language model designed by you, specific to your domain. Your model is composed of utterances, intents and entities: utterances are example phrases users could type into your bot, intents are the user’s intentions you want to extract from an utterance, and entities represent relevant pieces of information in the utterance, much like variables in a programming language.
An example of an utterance from our bot could be “I write my exam on the 23rd of August 2018”. Based on the utterance, our model is able to extract the intent “ExamDateCreation” as well as the entity “Date” <– 23.08.2018 (for a more detailed explanation, visit the LUIS Documentation). Once you have defined all the intents and entities needed for your domain in the LUIS web application and provided enough sample utterances, you can test and publish your LUIS app. After publishing you can access your app via a REST API, or in our case through the Azure bot service.
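To make intent and entity a bit more tangible, here is a small sketch that picks both out of a JSON answer. The JSON is a hand-written, simplified sample and its field names are my assumption about the response shape, so check the LUIS documentation for the real format. LuisResponseSketch is a made-up name.

```csharp
using System;
using System.Text.Json;

public static class LuisResponseSketch
{
    // Hand-written sample, NOT real service output; the field names are assumed.
    public const string SampleJson = @"{
      ""query"": ""I write my exam on the 23rd of August 2018"",
      ""topScoringIntent"": { ""intent"": ""ExamDateCreation"", ""score"": 0.97 },
      ""entities"": [ { ""entity"": ""23rd of august 2018"", ""type"": ""Date"" } ]
    }";

    // Extracts the top intent plus the type and text of the first entity.
    public static (string Intent, string EntityType, string EntityText) Parse(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            string intent = root.GetProperty("topScoringIntent").GetProperty("intent").GetString();
            JsonElement first = root.GetProperty("entities")[0];
            return (intent, first.GetProperty("type").GetString(), first.GetProperty("entity").GetString());
        }
    }

    public static void Main()
    {
        var result = Parse(SampleJson);
        Console.WriteLine($"{result.Intent}: {result.EntityType} = {result.EntityText}");
    }
}
```

In our bot we never had to write this kind of parsing ourselves, because the bot service integration hands you ready-made C# objects.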

the LUIS web application

LUIS is tightly integrated into the bot service. To integrate our model, all we had to do was add an attribute to a class and extend the prefabricated LuisDialog to get access to our intents and entities.

[Serializable]
[LuisModel("your-application-id", "your-api-key")]
public class ExamSubjectLuisDialog : LuisDialog<object>
{
    // Invoked when LUIS recognizes the "Klausurfach" (exam subject) intent.
    [LuisIntent("Klausurfach")]
    private async Task ExamPowerInput(IDialogContext context, IAwaitable<object> result,
        Microsoft.Bot.Builder.Luis.Models.LuisResult luisResult)
    {
        ...
    }
}

There’s nothing else to do to integrate LUIS into your bot. The bot connector service handles the message exchange between your bot code and the LUIS REST API and converts the JSON into C# objects you can directly use. Fun fact: the integration of the Google Calendar into our bot took us several hours, a lot of nerves and around 300 lines of code, whereas the integration of our LUIS model took around 5 minutes and only the few lines of code shown above for every LuisDialog we created.

Summary

By using the Azure bot service in combination with a custom LUIS model we were able to create a functional prototype of a conversational bot assisting you in creating your custom learning schedule by adding appointments to your Google calendar in roughly 24 hours, all while being able to understand and process natural language. With the power of the bot service, the bot is available on a number of channels, including Telegram, Slack, Facebook Messenger and Cortana.

It was a real pleasure to use these technologies because they work seamlessly together. Being able to use ready-to-use dialogs for LUIS sped up the development process enormously, as well as being able to deploy the bot on azure within one click out of Visual Studio. I included a little demonstration video of the bot below, because it is no longer in operation.

demo part 2

Zeiss Hackathon Munich

The second hackathon we participated in took place in January 2018. It was organized by Zeiss and sponsored by Microsoft.

The theme of this hackathon was “VISIONary Ideas wanted”, and most of the teams did something with VR/AR. One of the special guests of the hackathon was a teenager with a congenital defect that left him with only about 10% of his eyesight. The question asked was “can AI improve his life?”, so we sat down with him and asked him about his daily struggle with being almost blind. One problem he faces regularly, as do other blind people according to our internet research, is withdrawing cash from an ATM.
So after some time brainstorming we came up with the idea of a simple Android app which guides you through the withdrawal process via auditory guidance. Almost everyone has a smartphone, and the disability support for them is actually pretty good, so a simple app is a pretty good option for a little, accessible life improvement.

We figured out 3 essential things we needed to develop our app:
1. Image Classification, to distinguish between different types of ATM (for our prototype we focused on differentiating ATMs with touch screen and ATMs with buttons on the sides),
2. Optical Character Recognition, to read the text on the screen of the ATM to detect the stage of the withdrawal process the user is in and generate auditory commands via
3. Text to Speech, which comes out of the box in Android.

The Technology Stack

We wanted to develop an Android app, so we used Android in combination with Android Studio. We chose to develop an Android app simply because some of us were familiar with it and I always wanted to do something with Android.

For both the image classification and the OCR we again relied on Microsoft Services.
The Microsoft Vision API provides a REST endpoint for OCR, so we got ourselves an API key and we were ready to go.

For the image classification Microsoft provides the custom vision service, where you can train your own model.

Plugging the parts together

The flow of our app is quite simple: it takes a picture every 5 seconds, converts it into a byte array and first sends it to our custom image classifier to detect the ATM type. After the successful classification of the ATM, all further images are sent to the vision API for optical character recognition. We get a JSON response with all the text the vision API was able to extract from the image. The app then matches keywords (we focused on the flow of ATMs from Sparkasse) against the extracted text to detect the current stage of the withdrawal process and then creates auditory commands via text to speech. We didn’t even have to rely on frameworks by Microsoft like in the first hackathon; all we needed was to call the REST API of the services and process the response.
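The keyword matching step can be sketched in a few lines. Keep in mind that the app itself was an Android app; this is only an illustration of the idea in C#, and the stage names and keywords below are invented, not the ones we actually used.

```csharp
using System;
using System.Collections.Generic;

public static class WithdrawalStageSketch
{
    // Hypothetical stages and keywords, for illustration only.
    private static readonly List<KeyValuePair<string, string[]>> StageKeywords =
        new List<KeyValuePair<string, string[]>>
        {
            new KeyValuePair<string, string[]>("InsertCard", new[] { "insert", "card" }),
            new KeyValuePair<string, string[]>("EnterPin", new[] { "pin" }),
            new KeyValuePair<string, string[]>("ChooseAmount", new[] { "amount" }),
        };

    // Returns the first stage whose keywords all occur in the OCR text, or "Unknown".
    public static string DetectStage(string ocrText)
    {
        string text = ocrText.ToLowerInvariant();
        foreach (var stage in StageKeywords)
        {
            if (Array.TrueForAll(stage.Value, keyword => text.Contains(keyword)))
            {
                return stage.Key;
            }
        }
        return "Unknown";
    }

    public static void Main()
    {
        // A misread word ("entr") does not matter as long as the keyword survives.
        Console.WriteLine(DetectStage("Please entr your PIN now")); // EnterPin
        Console.WriteLine(DetectStage("Welcome"));                  // Unknown
    }
}
```

Matching on a few distinct keywords instead of whole sentences is what makes this kind of check tolerant of OCR mistakes.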

Summary

Much like in the first hackathon, we were really impressed by how well the services work out of the box. As you can imagine, the images of the ATM were oftentimes pretty shaky, but the OCR worked pretty well nevertheless. And because we only matched important, but distinct, keywords for every stage of the withdrawal process, the app could handle wrongly extracted sentences to a good degree. Our custom image classifier was able to differentiate between ATMs with touch screens and ATMs with buttons pretty reliably with only 18(!) sample pictures.

our custom ATM type classifier

After the 24 hours of coding (-2 hours of sleep :/ ), we had a functioning prototype and were able to pitch our idea to the jury with a “living ATM” played by Danny :).

The jury was impressed by the prototype and our idea of an “analog” pitch (we had no PowerPoint at all) and awarded us the Microsoft prize. We all got a brand new Xbox One X, a book by Satya Nadella and an IoT kit, which was pretty cool (sorry for bragging :P).

Billige Plätze and their new Xboxes

Takeaways

I think there are 3 main points you can take away from this blog post and my experiences with AI as a service.

  1. You can create working prototypes in less than 24 hours
  2. You don’t have to start from scratch
  3. It is not a shame to use finished models. When you need more, there is always time!

Before using these services I thought AI to be very time-consuming and difficult to get right. But we were able to create working prototypes in less than 24 hours! The models provided by Microsoft are very good and you can integrate them seamlessly into your application, be it in your conversational bot or in your Android App.
All projects are available in our GitHub organization, but beware, the code is very dirty, as usual for hackathons!

I hope I was able to inspire you to create your own application using AI services, and take away some of your fears of the whole AI and Machine Learning thing.

Have fun writing your own application and see you in our next blog post!
