CS

Imperative v. Declarative Functions

It’s widely accepted that declarative programming is “good” or “better” than imperative programming, but what do the terms even mean? It’s not enough to have a vague idea that declarative programming is easy to reason about - that certainly wasn’t enough for me in trying to understand the difference between the two.

Imperative programming is explained as *how* something is done, whereas a declarative approach deals with *what* is going to be done.

Before we dig further into this, it’s important to realize that declarative approaches are basically an abstraction over an imperative implementation. Think about it - if you’re ordering a coffee at a cafe, you can tell the barista “I’d like a pour over coffee”. That’s a declarative approach. If declarative approaches could exist by themselves, the barista would be able to turn to a paper filter and say to it “make a coffee”. That’s not possible. The barista NEEDS to know *how* to make the coffee - that’s the imperative implementation underneath our declarative coffee order. The *what* must be backed by a *how* at some point down the line.

Enough with the fluff - let’s look at a concrete example in code. In Swift - write an imperative function that accepts an array as a parameter and returns a new array containing each element of the original array minus 3.

func subtractThree(fromArray array: Array<Int>) -> Array<Int> {
    // Start with an empty result and fill it one element at a time.
    var returnArray: [Int] = []
    for item in array {
        returnArray.append(item - 3)
    }
    return returnArray
}
  

The above code takes more effort to read than a more declarative approach. We can find the operation being performed on each array element because we’re well versed in the general structure of a for loop. However, the code still tells us how we loop over the array instead of directly conveying what we want to achieve.

Loops can often be rewritten declaratively by using higher-order array functions (map, flatMap, etc). Written declaratively, the above would look like this:

func declarativeSubtractThree(fromArray array: Array<Int>) -> Array<Int> {
    return array.map { $0 - 3 }
}

We’re expressing the logic of an operation without having to describe the control flow involved - this is declarative programming.
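To tie this back to the coffee order, here is a minimal sketch of what a function like map might look like underneath. The name `transformEach` is made up for this example and is not part of the standard library; the point is only that the declarative call at the bottom is backed by an ordinary loop.

```swift
// Hypothetical helper: a hand-rolled stand-in for map, written imperatively.
extension Array {
    func transformEach<T>(_ transform: (Element) -> T) -> [T] {
        var result: [T] = []
        result.reserveCapacity(count)
        for item in self {              // the "how": an explicit loop
            result.append(transform(item))
        }
        return result
    }
}

// The call site only states the "what".
let shifted = [10, 7, 3].transformEach({ $0 - 3 })
print(shifted) // [7, 4, 0]
```

The caller never sees the loop, which is exactly what makes the call site read as declarative.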

 

Programming Languages - a brief overview

Language Differences

There are a huge number of languages used by programmers today - some people may wonder "why do we need so many?" The only thing a CPU can execute directly is machine code - however, machine code is tedious to write by hand, close to unreadable for anyone else, and impractical for anything but the smallest programs. Different machine code must also be written for different CPU architectures - the languages we use on a daily basis bridge the gap between machine code and humans. Some languages, such as assembly language or C, are low-level, meaning they are closer to machine code than others. Generally speaking, the closer to machine code a language is, the more you need to know about the hardware. Low-level languages give the programmer more direct control over how the CPU and memory are used, although the performance gap with high-level languages is becoming less of an issue, according to some. 

Programming languages often borrow ideas from one another and add newer features on top. Some are functional, some are object-oriented, and some are procedural. Some support a mix of the three. Some are dynamically typed, while some are statically typed. Some support concurrency and multithreading, and some don't. Some, like Erlang, approach concurrency through lightweight processes that pass messages to each other rather than through threads that share memory. To put it succinctly, the reason we have so many languages is that they each have their own use cases, and it is up to us to decide which language will best suit our needs. 

Compiled v Interpreted Languages

Compiled languages use a compiler to convert the source code into machine code - that way, the code is packaged up as machine code and can be sent to a target CPU. The compiled file is called an executable file. Pros are that the source code remains private, the program often runs faster because the code has been pre-converted and optimized for a specific target CPU, and it is ready to run as soon as the target machine gets it. Cons are that it isn’t cross-platform (it is compiled for a certain CPU), it’s less flexible (any change means recompiling), and compilation is an extra step. 

Interpreted languages don’t compile ahead of time - the source code is sent to the target machine, where a program called an interpreter reads and executes it. The target machine processes it on the fly, line by line. It doesn’t save the result as a separate file (like an executable file). Pros are that they’re cross-platform, easier to test (since you directly run the source code without a compilation step), and easier to debug since you have access to the source code. Cons are that an interpreter is required on the target machine, execution is often slower since nothing is precompiled, and the source code is public. 
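To make "line by line" concrete, here is a toy interpreter sketch. The three-command language (add, sub, mul applied to a running total) is invented purely for this example and does not correspond to any real language.

```swift
// Toy language (made up for this sketch): each line is "add N", "sub N" or "mul N",
// applied to a running total that starts at 0.
func interpret(_ source: String) -> Int {
    var total = 0
    // Read and execute one line at a time; nothing is translated ahead of time.
    for line in source.split(separator: "\n") {
        let parts = line.split(separator: " ")
        guard parts.count == 2, let value = Int(parts[1]) else {
            continue // skip lines we cannot understand
        }
        switch String(parts[0]) {
        case "add": total += value
        case "sub": total -= value
        case "mul": total *= value
        default: continue // unknown command: ignore the line
        }
    }
    return total
}

let program = "add 5\nmul 4\nsub 3"
print(interpret(program)) // 17
```

Changing the program only means changing the string; there is no separate build step, which is the flexibility described above.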

The two approaches can be combined - we can compile source code to an “intermediate language”, which takes it as far toward machine code as we can while maintaining its ability to be cross-platform. We then send this to target machines, and those machines finish the compilation (often at run time, which is known as JIT or just-in-time compilation) - the intermediate language used is also referred to as “byte code”.
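The two-step idea can be sketched with a toy example. The `Instruction` enum below is a hypothetical stand-in for byte code, and the second step simply interprets the instructions; a real JIT would go further and translate them into native machine code for the target CPU.

```swift
// Hypothetical intermediate form ("byte code") for a toy calculator language.
enum Instruction: Equatable {
    case add(Int)
    case multiply(Int)
}

// Step 1, done once before shipping: translate source text into instructions.
func compile(_ source: String) -> [Instruction] {
    var instructions: [Instruction] = []
    for line in source.split(separator: "\n") {
        let parts = line.split(separator: " ")
        guard parts.count == 2, let value = Int(parts[1]) else { continue }
        switch String(parts[0]) {
        case "add": instructions.append(.add(value))
        case "mul": instructions.append(.multiply(value))
        default: continue
        }
    }
    return instructions
}

// Step 2, done on each target machine: execute the platform-neutral instructions.
func run(_ instructions: [Instruction]) -> Int {
    var total = 0
    for instruction in instructions {
        switch instruction {
        case .add(let value): total += value
        case .multiply(let value): total *= value
        }
    }
    return total
}

let byteCode = compile("add 2\nmul 10")
print(run(byteCode)) // 20
```

The parsing work happens once, and the target machine only has to deal with the already-structured instructions.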

Scripting v non-Scripting Languages

Scripting languages are often interpreted languages; specifically, they are languages that typically run inside another program rather than on their own. JavaScript is an example of a scripting language - it runs inside of a web browser. 

Scripting languages are used alongside non-scripting languages in large systems - because they're largely interpreted, they are useful for coding small parts of the system that may be subject to frequent change. Those parts can be optimized for flexibility since they don't need to be recompiled after changes are made. Parts of a system that need to be optimized for speed of execution are often written in compiled (non-scripting, in this case) languages. 

Scripting languages are often used for rapid prototyping (they are faster to code with since they tend to be high-level and are flexible), data wrangling and general experimentation. 

An example of a scripting language also being an interpreted language is in the command line - the commands we use are written in a shell scripting language, which is then interpreted, line by line, by the shell (bash, for example, or zsh, the default in the Terminal app on recent versions of macOS). 

Network Communication

It's important to understand how machines establish networks to communicate with one another. Here, we'll briefly discuss a general overview of how connections are created, and discuss them in the context of the HTTP protocol.

Application Layer

A network protocol defines rules and conventions for communication between network devices; we'll use HTTP as our example. There can be many layers and parts involved - the application layer, the IP layer, TCP, UDP and sockets, to name a few. At the application layer, our browser (the Application Layer Program here) sends a “letter”, or request, to a server based on the URL (Uniform Resource Locator) typed in. It may say “dear server, please send me this page that I’m requesting.” This is application to application communication - the browser may include specifications in its letter such as “here is some cookie info, like the login info we exchanged last time. Also, you can compress the files in this particular format, I can accept HTML or text…”. The server can then customize a response. The browser passes off the request to the operating system, which connects to the server. The OS makes a holding area called a “socket” where it places the server response upon receipt. The browser can look to the socket and read the response just like a file. 
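The “letter” is plain text. As a rough sketch, the function below assembles an HTTP/1.1 GET request by hand; `makeGetRequest`, the host, the path, and the cookie value are all made up for illustration, and a real browser sends more headers than this.

```swift
// Hypothetical helper that assembles the text of an HTTP/1.1 GET request.
func makeGetRequest(host: String, path: String, headers: [(String, String)]) -> String {
    var lines = ["GET \(path) HTTP/1.1", "Host: \(host)"]
    for (name, value) in headers {
        lines.append("\(name): \(value)")
    }
    // Each line ends with CRLF, and a blank line marks the end of the headers.
    return lines.joined(separator: "\r\n") + "\r\n\r\n"
}

let request = makeGetRequest(
    host: "www.example.com",
    path: "/index.html",
    headers: [
        ("Accept", "text/html, text/plain"),   // "I can accept HTML or text"
        ("Accept-Encoding", "gzip"),           // "you can compress in this format"
        ("Cookie", "session=abc123")           // "the info we exchanged last time"
    ]
)
print(request)
```

Each header line corresponds to one of the “specifications” the browser puts in its letter.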

IP Layer

For the OS to send off the browser request, it has to look up the IP address for the website using DNS. An IP address is how requests find their way to the right place - there are many requests using the same network lines at the same time. An IP packet is like a letter with an address and a return address. Once an OS has the IP address for the website, it puts the browser HTTP request in an IP packet to send it off - it is possible that the request is too big to fit in one packet, and in that case, it goes into several packets. 
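Splitting a request across several packets can be sketched as follows. The `Packet` type and its `sequence` field are simplifications invented for this example (real IP and TCP headers carry many more fields and measure sizes in bytes), and the addresses come from ranges reserved for documentation.

```swift
// Hypothetical packet type: an address, a return address, and a piece of the message.
struct Packet: Equatable {
    let source: String       // return address
    let destination: String  // where the packet should end up
    let sequence: Int        // position of this chunk in the original message
    let payload: String
}

// Split a message into packets holding at most `maxPayload` characters each.
func packetize(_ message: String, from source: String, to destination: String, maxPayload: Int) -> [Packet] {
    precondition(maxPayload > 0, "maxPayload must be positive")
    let characters = Array(message)
    var packets: [Packet] = []
    var start = 0
    while start < characters.count {
        let end = min(start + maxPayload, characters.count)
        packets.append(Packet(source: source,
                              destination: destination,
                              sequence: packets.count,
                              payload: String(characters[start..<end])))
        start = end
    }
    return packets
}

let packets = packetize("GET /index.html", from: "192.0.2.10", to: "203.0.113.5", maxPayload: 6)
for packet in packets {
    print(packet.sequence, packet.payload)
}
```

Every packet carries both addresses, so each one can find its way independently of the others.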

Packets get sent to “sorting stations” called routers. These routers look at destination addresses and try to get the packet there. Sometimes a router may not know the full route, but knows of other routers closer to the destination that have a better idea of how to get there. Routers can break down, they can get flooded with requests, they can run out of space… when that happens, they start throwing away packets without telling anyone. We need some kind of delivery confirmation system, then; this comes in the form of the TCP layer. We wrap our HTTP request in a TCP segment before putting it into an IP packet. 

Transport Layer

TCP stands for Transmission Control Protocol - first, the client and the server (our OS and a website server, in this case) establish a connection. Then they send packets back and forth - as one side receives packets, it sends a confirmation message saying “I’ve received 5/5 IP packets so far”. If the sending party doesn’t receive confirmation after a period of time, it assumes the packets were lost and resends them. 
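The confirm-or-resend loop can be modelled in a few lines. This is a toy simulation, not how a real TCP stack is written: the “network” is a closure that says whether a given attempt got through, a missing confirmation stands in for the timeout, and all names are made up for this sketch.

```swift
// Toy model of acknowledgement and retransmission.
// Returns how many attempts each packet needed, or nil if a packet never got through.
func sendReliably(packetCount: Int, maxAttempts: Int, deliver: (_ packet: Int, _ attempt: Int) -> Bool) -> [Int]? {
    var attemptsUsed: [Int] = []
    for packet in 0..<packetCount {
        var acknowledged = false
        var attempt = 0
        while !acknowledged && attempt < maxAttempts {
            attempt += 1
            // No confirmation here plays the role of the timeout expiring.
            acknowledged = deliver(packet, attempt)
        }
        guard acknowledged else { return nil } // give up on the connection
        attemptsUsed.append(attempt)
    }
    return attemptsUsed
}

// Simulated network that silently drops the first attempt of packet 1.
let attempts = sendReliably(packetCount: 3, maxAttempts: 5) { packet, attempt in
    !(packet == 1 && attempt == 1)
}
print(attempts ?? []) // [1, 2, 1]
```

Packet 1 needed a second attempt, which is the extra delay that reliability costs.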

Confirming receipt has a big drawback - it slows things down. UDP (User Datagram Protocol) is sometimes used instead of TCP. TCP is connection-oriented - once a connection is established, data can be sent in either direction. UDP is connectionless - each message is sent as an independent packet (a datagram). One program sends a group of IP packets to another program and that is the end of the relationship. TCP is thus more suited to systems that need high reliability, and where transmission time is less important - UDP, on the other hand, is suitable for applications that need faster transmission (such as games). Since UDP is also stateless, it is useful for servers that answer small queries in extremely high volume. HTTP, in our case, uses TCP. UDP doesn’t order packets, as they are considered independent of one another - TCP delivers data in the order in which it was sent. 
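The ordering difference is easy to see in a small sketch. The `Segment` type here is hypothetical and much simpler than a real TCP segment or UDP datagram; it only carries a sequence number and a piece of the message.

```swift
// Hypothetical segment type: a sequence number plus a piece of the message.
struct Segment {
    let sequence: Int
    let data: String
}

// TCP-style delivery: put the pieces back in sequence order before handing them over.
func reassembleInOrder(_ received: [Segment]) -> String {
    return received.sorted(by: { $0.sequence < $1.sequence }).map({ $0.data }).joined()
}

// UDP-style delivery: hand the pieces over in whatever order they arrived.
func deliverAsReceived(_ received: [Segment]) -> String {
    return received.map({ $0.data }).joined()
}

// The network delivered the pieces out of order.
let arrived = [
    Segment(sequence: 2, data: "lo"),
    Segment(sequence: 0, data: "he"),
    Segment(sequence: 1, data: "l")
]
print(reassembleInOrder(arrived)) // hello
print(deliverAsReceived(arrived)) // lohel
```

With UDP, any reordering (if it matters at all) is left to the application.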

Both TCP and UDP add a piece of important information to our network request: a port number. IP packets only contain addresses that specify computers, not the applications running on those computers. The IP packet carrying our HTTP request knows the address of the machine running the website server, but there could be other servers running on that machine. The port number included in the network request tells the target OS which program to send the request to (port 80 is the conventional port for an HTTP web server, and 443 for HTTPS). The same goes in reverse, where the server response will contain a port number that the client OS will use to direct the response to the proper program. This number is based on the socket that the client OS originally set up.

There are four things that uniquely identify a socket - Destination IP, Destination Port, Source IP, Source Port. You can have multiple sockets open to the same Destination IP and Port, as long as they have a different Source Port. Firefox tab 3 and Firefox tab 4 would have different Source Ports, so you could load the exact same website on both tabs. 
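As a small sketch, the four fields can be modelled as a value type. `SocketID` is a name invented for this example, and the addresses and port numbers are placeholders.

```swift
// Hypothetical value type holding the four fields that identify a socket.
struct SocketID: Hashable {
    let sourceIP: String
    let sourcePort: Int
    let destinationIP: String
    let destinationPort: Int
}

let tab3 = SocketID(sourceIP: "192.0.2.10", sourcePort: 51000,
                    destinationIP: "203.0.113.5", destinationPort: 80)
let tab4 = SocketID(sourceIP: "192.0.2.10", sourcePort: 51001,
                    destinationIP: "203.0.113.5", destinationPort: 80)

// Same destination IP and port, different source port: two distinct sockets.
let openSockets: Set<SocketID> = [tab3, tab4]
print(openSockets.count) // 2
print(tab3 == tab4)      // false
```

If the two source ports were equal as well, the set would collapse to a single entry, because nothing would be left to tell the two connections apart.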

The general flow of things can be summarized as such: Application Layer Request (HTTP in our case) -> Wrapped in the TCP/UDP layer which includes port number -> Wrapped in IP Packet -> Client OS sends request to target machine and Application Layer Program -> Server OS uses Port to send request to proper Application Layer Program -> Sends response using Source IP and Port -> Client OS receives response and stores in a socket, which the client Application Layer Program can read from. 
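The wrapping in that flow can be sketched with nested types. Everything below is heavily simplified and hypothetical (real TCP and IP headers contain many more fields); it only shows each layer carrying the one above it, and the port being used to pick the program on the server.

```swift
// Hypothetical, heavily simplified layers: each one wraps the layer above it.
struct TCPSegment {
    let sourcePort: Int
    let destinationPort: Int
    let payload: String        // the HTTP request text
}

struct IPPacket {
    let sourceIP: String
    let destinationIP: String
    let payload: TCPSegment
}

// Client side: wrap the application data on the way down.
func wrap(httpRequest: String, fromIP: String, fromPort: Int, toIP: String, toPort: Int) -> IPPacket {
    let segment = TCPSegment(sourcePort: fromPort, destinationPort: toPort, payload: httpRequest)
    return IPPacket(sourceIP: fromIP, destinationIP: toIP, payload: segment)
}

// Server side: unwrap on the way up and use the port to pick the receiving program.
func programReceiving(_ packet: IPPacket, programsByPort: [Int: String]) -> String? {
    return programsByPort[packet.payload.destinationPort]
}

let outgoing = wrap(httpRequest: "GET / HTTP/1.1", fromIP: "192.0.2.10", fromPort: 51000,
                    toIP: "203.0.113.5", toPort: 80)
let listeners = [80: "web server", 25: "mail server"]
print(programReceiving(outgoing, programsByPort: listeners) ?? "no listener") // web server
```

The response travels the same way in reverse, addressed to the source IP and source port found in the incoming packet.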

 

***A URL contains information such as the protocol type, host, port, resource path, and query.

(protocol)http://(host)www.amazon.com(port):80/(resourcepath)store/electronics/resource(query)?a=by8v&x
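The same breakdown can be done in code. This sketch uses Foundation's URLComponents on the example URL, written with a colon in front of the port; the variable names are just for illustration.

```swift
import Foundation

// Example URL from the text, with ":" separating the host from the port.
let exampleURL = "http://www.amazon.com:80/store/electronics/resource?a=by8v&x"

if let components = URLComponents(string: exampleURL) {
    print(components.scheme ?? "")  // protocol
    print(components.host ?? "")    // host
    print(components.port ?? -1)    // port
    print(components.path)          // resource path
    print(components.query ?? "")   // query
}
```

When the port is left out of a URL, `port` comes back as nil and the client falls back to the default port for the protocol.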

 

A Note on Server Responses - 

Request responses contain status codes that tell the client how to interpret the response:

1xx contain informational messages and are purely provisional.

2xx contain success messages (200 = OK, 202 = accepted, 204 = no content, 205 = reset content, 206 = partial content in response)

3xx contain redirection messages (301 = moved permanently aka. resource is at a different URL, 302 = found aka resource is temporarily at a different URL, 303 = see other aka the client should fetch the result from a different URL using GET, 304 = not modified aka server says resource hasn’t changed and the client should use a cached copy)

4xx contain client error messages (we all know 404 not found, but there is also 400 = bad request, 401 = unauthorized, 403 = forbidden)

5xx contain server error messages
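A client usually looks at the first digit before anything else. Here is a minimal sketch of that check; the `StatusClass` enum and `classify` function are names made up for this example.

```swift
// Hypothetical grouping of status codes by their first digit.
enum StatusClass {
    case informational, success, redirection, clientError, serverError, unknown
}

func classify(statusCode: Int) -> StatusClass {
    switch statusCode {
    case 100..<200: return .informational
    case 200..<300: return .success
    case 300..<400: return .redirection
    case 400..<500: return .clientError
    case 500..<600: return .serverError
    default: return .unknown
    }
}

print(classify(statusCode: 404)) // clientError
print(classify(statusCode: 304)) // redirection
```

Grouping by range means a client can still react sensibly to a specific code it has never seen before.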

There is plenty more to learn just about the HTTP protocol, so feel free to continue reading up on it!