Thursday | 21 NOV 2024

Uploading Images to my Blog

Title:
Date: 2023-10-25
Tags:  

Table of Contents

  1. Uploading Files
  2. The Code

I recently went on a trip and also watched Sonny Boy. I took pictures of both that I wanted to post on my blog, but my blog currently has no way to upload images.

There are a few different issues so I'll outline the main ones. The first issue is that Pick doesn't like binary data. It uses attribute marks and segment marks to mean specific things, but in binary data those characters are just ordinary bytes. This means that I could upload an image containing a segment mark and Pick would see it as 2 different things. So I need to convert the binary data into ASCII before I can safely handle it. This problem isn't too bad as it can be dealt with by converting the binary data into hex. This does impose a cost: the hex data you now have to store is twice as big, and you need to convert it back any time you want the raw binary data.
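
To make the hex trade-off concrete, here is a minimal sketch in Python (not Pick BASIC, and not the code my server runs). The sample bytes are made up; the only assumption carried over from Pick is that the attribute mark is character 254 and the segment mark is character 255.

```python
# Hex-encode binary data so it no longer contains Pick's delimiter characters.
AM = 0xFE  # attribute mark, CHAR(254)
SM = 0xFF  # segment mark, CHAR(255)

# Made-up sample: a few arbitrary bytes with both marks mixed in.
raw = bytes([0x89, 0x50, 0x4E, 0x47, AM, 0x00, SM, 0x0A])

hex_text = raw.hex().upper()        # plain ASCII, two characters per byte
restored = bytes.fromhex(hex_text)  # convert back when the raw bytes are needed

print(hex_text)                  # 89504E47FE00FF0A
print(len(raw), len(hex_text))   # 8 16
print(restored == raw)           # True
```

The hex text only ever contains 0-9 and A-F, so it is safe to store in a Pick record, at the price of doubling the size.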

The second major issue is that I am using my own custom web server, SERAPHIM, so I now need to add support for file uploads myself. This is a problem of my own making.

Once I solved the first issue of how to store binary data in Pick, I got to the step of updating my web server. This is what the rest of my post will be about.

Uploading Files

I love HTTP because the logic of things falls out of plaintext. I didn't read much to figure out how file uploads work because for the most part you can start at the intuitive place and slowly build outwards. Who knows if this is actually good but I'm lucky to be making something for myself so I have room to experiment.

The first step is to create a form with a file input and then, on the other end, have my server dump the request directly to the screen. This way I can see how the data is being sent. This was relatively straightforward. The hardest part might be creating a small file that you can dump to the screen without a whole lot of junk.

To generate the data, I used yes and truncate:

yes > file.txt
truncate -s 25 file.txt

yes will generate a huge file very quickly, so kill it right away and then use truncate to snip the file down. This leaves a 25 byte file that you can dump to the screen. You can also manually create a small text file to test with.

The form element in HTML will look something like this. The key thing to note is the enctype attribute.

<form action="" method="POST" enctype="multipart/form-data">
   <input type="file" name="files" multiple>
   <button>submit</button>
</form>

Now the interesting thing happens when the form is submitted after the file is selected. Once the request is on the screen there will be a couple of things that will make it obvious what is going on.

The first thing is that the content type is now different from a regular form: the multipart bit from the enctype will be part of the content type. It will also have something called the boundary, which is a random string chosen by the browser.
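
As a rough illustration, pulling the boundary out of the content type could look like the Python sketch below. The header value and the function name are made up for the example, and the parsing is deliberately naive (it would not cope with a quoted boundary that contains a semicolon).

```python
def boundary_from_content_type(content_type: str) -> str:
    """Return the boundary parameter, or '' if this is not multipart/form-data."""
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "multipart/form-data":
        return ""
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key.lower() == "boundary":
            return value.strip().strip('"')
    return ""


# Hypothetical header value, similar in shape to what a browser sends.
header = "multipart/form-data; boundary=----sketch1234567890"
boundary = boundary_from_content_type(header)

# Inside the request body each delimiter line is the boundary prefixed with "--".
delimiter = "--" + boundary

print(boundary)   # ----sketch1234567890
print(delimiter)  # ------sketch1234567890
```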

The body of the request will have the data of the file in it marked out with the boundary.
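
To show the shape of that body without needing a browser, the Python sketch below builds one by hand. The boundary, the field name and the Content-Type of the part are assumptions chosen for the example; the file content is the same 25 bytes that yes and truncate produce.

```python
CRLF = b"\r\n"


def build_multipart_body(boundary: bytes, files: list[tuple[str, bytes]]) -> bytes:
    """Lay out files the way a multipart/form-data body does."""
    chunks = []
    for filename, data in files:
        chunks.append(b"--" + boundary + CRLF)  # delimiter line opens each part
        chunks.append(
            b'Content-Disposition: form-data; name="files"; filename="'
            + filename.encode("ascii") + b'"' + CRLF
        )
        chunks.append(b"Content-Type: application/octet-stream" + CRLF)
        chunks.append(CRLF)                     # blank line ends the part headers
        chunks.append(data + CRLF)              # the file bytes, then CRLF
    chunks.append(b"--" + boundary + b"--" + CRLF)  # closing delimiter
    return b"".join(chunks)


sample_boundary = b"----sketch1234567890"  # made-up boundary
sample_file = b"y\n" * 12 + b"y"           # 25 bytes, like the truncated yes output
body = build_multipart_body(sample_boundary, [("file.txt", sample_file)])

print(body.decode("ascii"))
```

Dumping this to the screen gives the same picture as the real request: a delimiter line, a couple of headers, a blank line, the file data, and a closing delimiter that ends in an extra "--".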

HTTP makes file uploads quite simple now. The data simply gets pumped into the request with delimiters that the server can then parse out. This is sounding very much like how Pick works! I hope this is as entertaining to you as it is to me.

I would have loved to have changed the boundary to attribute marks and used Pick properly, but unfortunately binary data can contain those special characters. As I was keen on keeping things as Pick as possible, the solution I decided on was to escape the special characters out, swap the boundary for attribute marks, and treat the result as a multivalue string.

I then was able to get the entire file by using a regular Pick extraction and with that we have the file!
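
The escape-then-split idea can be sketched in Python as follows. This is an illustration of the approach rather than the Pick code itself: the sample body and delimiter are made up, and a plain split on byte 254 stands in for a Pick extraction.

```python
AM = b"\xfe"  # attribute mark, CHAR(254)
SM = b"\xff"  # segment mark, CHAR(255)


def split_on_boundary(body: bytes, delimiter: bytes) -> list[bytes]:
    """Escape real marks, turn each delimiter into an attribute mark, then split."""
    escaped = body.replace(AM, b"<ESC>254<ESC>").replace(SM, b"<ESC>255<ESC>")
    marked = escaped.replace(delimiter, AM)
    return marked.split(AM)


# Made-up body: one part whose data contains both special characters.
sample = b"--XYZ\r\nheader\r\n\r\nA\xfeB\xffC\r\n--XYZ--\r\n"
fields = split_on_boundary(sample, b"--XYZ")

for number, field in enumerate(fields, start=1):
    print(number, field)
```

The first field is empty and the last one is just the trailing "--", so only the fields in between hold files. One caveat of this scheme: if the literal text <ESC>254<ESC> already appears in an upload, unescaping later will turn it into a mark.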

The next step was to change the special characters back to what they really are and then convert the entire thing to hex. With that my web server can handle file uploads and expose them via the request item that I build. This then gets passed to the function that gets run for a specific endpoint.
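
A Python sketch of that last step is below. The request_files dictionary and the add_file helper are hypothetical stand-ins for the request item; only the order of operations (unescape first, then hex) comes from the description above.

```python
AM = b"\xfe"
SM = b"\xff"


def unescape_marks(escaped: bytes) -> bytes:
    """Put the real CHAR(254) and CHAR(255) bytes back."""
    return escaped.replace(b"<ESC>254<ESC>", AM).replace(b"<ESC>255<ESC>", SM)


# Hypothetical request item: file names and hex data kept as parallel lists.
request_files = {"names": [], "data": []}


def add_file(name: str, escaped_data: bytes) -> None:
    raw = unescape_marks(escaped_data)
    request_files["names"].append(name)
    request_files["data"].append(raw.hex().upper())  # hex is safe to pass around


add_file("sample.bin", b"A<ESC>254<ESC>B<ESC>255<ESC>C")
print(request_files)  # {'names': ['sample.bin'], 'data': ['41FE42FF43']}
```

Doing the unescape before the hex conversion matters: converting first would hex-encode the escape text itself and the original bytes would be lost.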

Multiple files work very much the same way: each file is split by the boundary, and the reason the boundary is random is to make it very unlikely to appear in the file data itself. Ingenious. I wish this was something that Pick had as well, though I bet that at the time the special characters were chosen because they would never be in real data.
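
To illustrate why a random boundary works as a delimiter, here is a small Python sketch of how a sender could pick one. The prefix and length are arbitrary choices, and the explicit check against the payloads is an extra precaution added for this example, not something described above.

```python
import secrets


def pick_boundary(payloads: list[bytes]) -> bytes:
    """Return a random boundary that does not occur in any of the payloads."""
    while True:
        candidate = b"----sketch" + secrets.token_hex(16).encode("ascii")
        if not any(candidate in payload for payload in payloads):
            return candidate


# Two made-up uploads: a small text file and every possible byte value.
payloads = [b"y\n" * 12 + b"y", bytes(range(256))]
chosen = pick_boundary(payloads)

print(chosen)
print(any(chosen in payload for payload in payloads))  # False
```

Pick's fixed marks cannot do this: characters 254 and 255 are reserved up front, so any data that happens to contain them has to be escaped or encoded instead.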

The logic to get file uploads working is simple, but unfortunately it wasn't as simple to do. There were a few weird things in how Pick interacted with binary data, and there are still issues with how file data is read from the socket in ScarletDME, so for now my blog is still lacking an image upload. Hopefully once I figure out why there are socket issues, images will upload properly.

However my code works perfectly in UniVerse so I'm pretty happy with that.

The Code

Below is the code. There are comments, and this is one of the few times where I have comments in my code. I believe that if you have comments in the code then something isn't simple enough yet and that the code could probably be written better. I'll need to revisit this at some point, possibly in a full rewrite.

   END ELSE IF CONTENT.TYPE[1,19] = 'multipart/form-data' THEN
      BOUNDARY = TRIM(CONTENT.TYPE[INDEX(CONTENT.TYPE,'=',1)+1,9999])
*
* BOUNDARY IS PREPENDED WITH --
*
      BOUNDARY = '--' : BOUNDARY
*
* ESCAPE OUT CHARS SO WE CAN USE @AM. CHAR(255) ALSO BREAKS THINGS AS IT IS THE SEGMENT MARK
*
      REQUEST.RAW.BODY = CHANGE(REQUEST.RAW.BODY,@AM,'<ESC>254<ESC>')
      REQUEST.RAW.BODY = CHANGE(REQUEST.RAW.BODY,CHAR(255),'<ESC>255<ESC>')
      REQUEST.RAW.BODY = CHANGE(REQUEST.RAW.BODY,BOUNDARY,@AM)
*
      NUMBER.OF.BINS = DCOUNT(REQUEST.RAW.BODY,@AM)
*
      FOR BIN.CTR = 1 TO NUMBER.OF.BINS
         IF BIN.CTR = 1 OR BIN.CTR = NUMBER.OF.BINS THEN
            CONTINUE
         END
*
         BIN.DATA = REQUEST.RAW.BODY<BIN.CTR>
*
* REMOVE TRAILING CR:LF
*
         BIN.DATA = BIN.DATA[1,LEN(BIN.DATA)-2]
*
* REMOVE LEADING CR:LF
*
         BIN.DATA = BIN.DATA[3,LEN(BIN.DATA)-2]
*
* FIND \r\n\r\n TO FIND DATA SECTION
*
         BIN.POS = INDEX(BIN.DATA,CR:LF:CR:LF,1)
*
* HEADER SECTION IS UP TO THE \r\n\r\n
*
         BIN.HEADER = BIN.DATA[1,BIN.POS]
*
* DATA SECTION IS EVERYTHING AFTER THE \r\n\r\n, WHICH IS THE REMAINING LEN - BIN.POS - 3 CHARACTERS
*
         BIN.DATA = BIN.DATA[BIN.POS+4,LEN(BIN.DATA)-LEN(BIN.HEADER)-3]
*
* UNESCAPE CHAR(254) AND CHAR(255)
*
         BIN.DATA = CHANGE(BIN.DATA,'<ESC>254<ESC>',@AM)
         BIN.DATA = CHANGE(BIN.DATA,'<ESC>255<ESC>',CHAR(255))
*
* CONVERT THE BINARY DATA INTO HEX SO WE CAN PASS AROUND
*
         BIN.DATA = OCONV(BIN.DATA,'MX0C')
*
* GET THE NAME OF THE UPLOADED FILE
*
         PATTERN = \0X'filename="'0X'"'0X\
         BIN.NAME = MATCHFIELD(BIN.HEADER,PATTERN,3)
*
         REQUEST.FILES<1,-1> = BIN.NAME
         REQUEST.FILES<2,-1> = BIN.DATA
      NEXT BIN.CTR
*
   END ELSE
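
For readers who don't know Pick BASIC, the per-file part of the loop above (trim the CR:LF pairs, split the header from the data, pull out the filename) can be sketched in Python. This mirrors the logic rather than translating it line by line: the function name and sample part are made up, and a regular expression stands in for MATCHFIELD.

```python
import re


def split_part(part: bytes) -> tuple[str, bytes]:
    """Split one boundary-delimited part into (filename, file data)."""
    # Drop the CRLF that follows the delimiter and the CRLF before the next one.
    part = part[2:-2]
    # Headers end at the first blank line; everything after it is file data.
    header, _, data = part.partition(b"\r\n\r\n")
    match = re.search(rb'filename="([^"]*)"', header)
    name = match.group(1).decode("utf-8", "replace") if match else ""
    return name, data


# Made-up part, shaped like one field after splitting on the boundary.
sample_part = (
    b'\r\nContent-Disposition: form-data; name="files"; filename="file.txt"'
    b"\r\nContent-Type: text/plain\r\n\r\ny\ny\ny\r\n"
)
name, data = split_part(sample_part)

print(name)  # file.txt
print(data)  # b'y\ny\ny'
```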