Introduction to (sane) programming

Mariano Bernaldo <m.bernaldo.de.quiros@umcg.nl>

  • Tools

  • Software engineering

  • Clean code

  • Working with notebooks

Tools

Tools >>>> Process

  • Repetitive work is boring…

  • …and prone to errors

  • Why should I work, when I can make my computer do it for me?

Version control systems

  • Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.

Why?

  • Revert files back to a previous state

  • "Freeze" important versions of a document

  • Compare changes over time

  • Track progress of a project

  • See who modified something, and when

Modern version control systems

  • Remote backup of files

  • Powerful tool for collaboration

Git

  • Currently the most widely used version control system

  • Free and open source

  • Distributed

  • Powerful and flexible

  • Learning curve can be steep

Installation

Package managers are highly recommended: Anaconda, Chocolatey, Homebrew…

But what is it?

[Figure: Git architecture]

How to use Git

[Figure: Git workflow]

You WILL regret not using version control

Other tools

  • Integrated Development Environments

  • Debuggers

  • Libraries

  • Reference sites: Stack Overflow, Rosetta Code, Kaggle…

Scripting languages

Do it once, do it right

(and never do it again)
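
As a minimal sketch of what "do it once" can look like, the script below summarizes every CSV file in a folder instead of opening each file by hand. The function name `summarize_folder`, the `value` column, and the folder layout are assumptions made for illustration only.

```python
import csv
from pathlib import Path
from statistics import mean


def summarize_folder(folder, column="value"):
    """Return {file name: mean of `column`} for every CSV file in `folder`."""
    summary = {}
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            values = [float(row[column]) for row in csv.DictReader(handle)]
        # Skip files without data rows instead of failing on mean([])
        if values:
            summary[csv_path.name] = mean(values)
    return summary
```

Calling `summarize_folder("measurements")` (a hypothetical folder name) would return one average per file, and the same call keeps working when new files are added.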

[Figure: geeks and repetitive tasks]

Software engineering

[Figure: building a dam]

the development cycle

waterfall

it didn’t work…​

building software

  • Software is complex!

  • Requirements are often fuzzy

  • Cost of changes is low

  • Every project is a new project*

The agile manifesto (2001)

  • adaptive planning

  • evolutionary development

  • early delivery

  • continuous improvement

  • flexible responses to change

Waterfall vs. Agile

[Figures: advantages of each approach]

industry is learning

SpaceX

what does this mean for you?

  • Get a working prototype ASAP

  • Keep adding features and improving from there

  • Communicate! If possible, ask for input every few iterations

Fast and dirty?

Clean code

what is good code?

[Figure: code quality measured in WTFs per minute]

good code

  • It works

  • It is easy to understand

  • It is easy to change

Bad code: code smells

  • Duplicated code

  • Unnecessary complexity

  • A single change needs to be applied in many places at the same time.

  • Methods do too many things

  • Too many nested if / loops

  • Too many parameters

  • …​
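
To make one of these smells concrete, here is a small sketch of "too many nested ifs" and a flatter equivalent that uses guard clauses. The order structure, the prices, and both function names are invented for the example.

```python
def shipping_cost_nested(order):
    """Deeply nested version: the actual rule is buried three levels down."""
    if order is not None:
        if order["items"]:
            if order["country"] == "NL":
                return 5.0
            else:
                return 15.0
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("no order given")


def shipping_cost_flat(order):
    """Same behavior with guard clauses: reject the odd cases first."""
    if order is None:
        raise ValueError("no order given")
    if not order["items"]:
        raise ValueError("order has no items")
    return 5.0 if order["country"] == "NL" else 15.0
```

Both functions return the same results; the second one simply states the exceptional cases up front so the main rule can be read at a glance.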

How to improve code quality?

"Refactoring is the process of changing a software system in such a way that it does not alter the function of the code yet improves its internal structure."

Martin Fowler, Refactoring: Improving the Design of Existing Code (1999)

KISS

Keep it Simple, Stupid

Avoid code duplication

Don't repeat yourself

  • Probably the most common and worst mistake in programming

  • Avoid at all costs!

refactoring code duplication

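# Before: both functions repeat the same file-reading code
def f1():
  # read file
  ...
  # use complex method to calculate a

def f2():
  # read file
  ...
  # use complex method to calculate b

# After: the shared steps live in their own functions
# (a and b stand for whatever differs between the two calculations)
def read_file():
  ...

def complex_method(parameter):
  ...

def f1():
  read_file()
  complex_method(a)

def f2():
  read_file()
  complex_method(b)
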
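# Before: f1 and f2 still repeat the file-reading code
def f1():
  # read file
  ...
  # calculate a

def f2():
  # read file
  ...
  # calculate b

def higher_function():
    if condition:
        f1()
    else:
        f2()

# After: read the file once, then branch only on what differs
def read_file():
    ...

def complex_method(parameter):
    ...

def higher_function():
    read_file()
    if condition:
        complex_method(a)
    else:
        complex_method(b)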

A special case

# Before: f1 and f2 differ only in the middle step
def f1():
  # same code
  ...
  # method 1
  ...
  # same code

def f2():
  # same code
  ...
  # method 2
  ...
  # same code

# After: the differing step is passed in as an argument
def method_1():
    ...

def method_2():
    ...

# Functions are just another type of object!
def f(in_method):
  # same code
  ...
  in_method()
  ...
  # same code

def higher_function():
    f(method_1)
    f(method_2)
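
A runnable sketch of the same idea, with concrete stand-ins for the placeholders: the shared steps (parsing and rounding) stay in one function, and the step that varies is passed in as an argument. The names `normalize`, `rank`, and `process` are hypothetical and chosen only for this illustration.

```python
def normalize(values):
    """Scale the values so that they sum to 1."""
    total = sum(values)
    return [value / total for value in values]


def rank(values):
    """Replace each value by its rank (0 = smallest)."""
    ordered = sorted(values)
    return [ordered.index(value) for value in values]


def process(raw_text, transform):
    """Shared steps wrapped around a pluggable `transform` function."""
    values = [float(token) for token in raw_text.split(",")]  # same code
    transformed = transform(values)                            # the step that varies
    return [round(value, 2) for value in transformed]          # same code


print(process("5, 1, 4", normalize))
print(process("5, 1, 4", rank))
```

Adding a third variant then only requires writing one new small function, not a third copy of the shared code.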

Each function should do one thing

“The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.”

Robert C. Martin, Clean Code (2008)

size is not everything!

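# Before: every function is small, but each one calls the next,
# so the overall flow is hidden inside the chain
def function_1():
  # code here
  return function_2(results_1)

def function_2(input_2):
  # code here
  return function_3(results_2)

def function_3(input_3):
  # code here
  return function_4(results_3)

def function_4(input_4):
  # code here
  return results_4

# After: each function only returns its own results,
# and one function makes the sequence of steps explicit
def main_function():
  results_1 = function_1()
  results_2 = function_2(results_1)
  results_3 = function_3(results_2)
  return function_4(results_3)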

Use comments

Proper commenting can

  • make code maintenance much easier

  • help find bugs faster

  • make your code more readable to other people

  • make your code more readable to yourself (in six months)

Don’t over-comment your code

  • comments are a necessary evil

  • comments cover up naming failures

  • comments must evolve with code

when and how to write comments

  • Always try to explain yourself in code.

  • Don’t be redundant.

  • Use them in complex expressions.

  • Use as explanation of intent.

  • Use as clarification of code.

  • Use as warning of consequences.

  • Don’t comment out code: remove it (and use version control)
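
The difference between a redundant comment and a useful one can be sketched as follows. The function, its parameters, and the 30-day default are made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def is_session_expired(last_seen, now, max_age_days=30):
    """Return True if the session is at least `max_age_days` old."""
    # Warning of consequences: both timestamps must be in seconds and use
    # the same clock; mixing clocks silently gives wrong ages.

    # A redundant comment here would be "subtract last_seen from now":
    # the code already says that, so no comment is needed.
    age_in_seconds = now - last_seen

    # Intent: >= (not >) so a session that is exactly max_age_days old
    # already counts as expired.
    return age_in_seconds >= max_age_days * SECONDS_PER_DAY
```

The comments that remain explain why the code is written this way, which the code alone cannot express.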

Naming Conventions

  • Use names that are easy to understand.

  • Format them consistently.

  • Names must help the reader understand what a piece of code does.

  • Avoid single-letter variable names.
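
A small sketch of the difference names make. Both functions below do exactly the same thing; all names are invented for the illustration.

```python
def f(l, t):
    r = []
    for x in l:
        if x > t:
            r.append(x)
    return r


def select_values_above_threshold(values, threshold):
    return [value for value in values if value > threshold]
```

The second version can be understood from its signature alone, without reading the body.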

example: replacing comments with good naming

# Before: a vague name that needs comments to explain it
# This function calculates prices, compares them to sales
# promotions, checks if the prices are valid,
# then sends a promotion email to the user
def doSomeThings():
  # Calculate prices
  ...
  ...
  # Compare prices with sales promotions
  ...
  ...
  # Check if calculated prices are valid
  ...
  ...
  # Send promotions to users
  ...
  ...

# After: the names explain each step, so the comments are no longer needed
def sendPromotionEmailToUsers():
  calculatePrices()
  comparePricesWithSalesPromotions()
  checkIfCalculatedPricesAreValid()
  sendPromotionEmail()

keep a consistent format

  • useCamelCase

  • OrPascalCase

  • or_snake_case

  • or-kebab-case

  • But not all of them together

def my_function():
  print("Hello in a function")

def myFunction():
  print("Ey, I'm a different function")

# Not even valid Python: hyphens are not allowed in names
def my-Function():
  print("I'm yet another function, good luck choosing the right one")

# Also a syntax error, for the same reason
def my_FunctionIs-A-Terrible_mess():
  print("Imagine how things turn when you have several thousand lines of code...")

Some languages have a standard style guide (for example, PEP 8 for Python). Check it out!

Working with notebooks

Notebooks are a powerful tool for scientific programming

  • They let you keep a detailed record of your work

  • Everything is in one place

  • They allow interactive code and data exploration

  • Easy to share

how to work with notebooks?

  • Plot everything

  • Write detailed explanations of what you do, and why

    • Unless you are writing for a very specific kind of reader, also explain "well-known" methods

    • Btw, learn Markdown

  • Keep your notebooks reasonably short

  • Create different notebooks for different aims

  • Refactoring usually means taking the code and putting it in auxiliary files
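
As a rough sketch of that last point, a helper that was copied between notebook cells can move into a small module saved next to the notebook. The module name `cleaning.py`, the function `drop_missing`, and the column names are assumptions for illustration.

```python
# Contents of a hypothetical helper module, e.g. cleaning.py,
# saved in the same folder as the notebook.

def drop_missing(rows, required_keys):
    """Keep only the rows (dicts) with a non-empty value for every required key."""
    return [
        row for row in rows
        if all(row.get(key) not in (None, "") for key in required_keys)
    ]


# In the notebook, a single cell then replaces the copied code:
#     from cleaning import drop_missing
#     clean_rows = drop_missing(rows, ["age", "weight"])
```

The notebook stays short and focused on the analysis, while the helper can be reused and kept under version control like any other code.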

Guided example: Machine learning workflow
