Hello, and welcome to Software Carpentry at McMaster University on Monday, May 5.
Please see http://gdevenyi-05-05-mcmaster.github.io/2014-05-05-mcmaster for our home page.
Why Crunch Mode Doesn't Work: http://legacy.igda.org/why-crunch-modes-doesnt-work-six-lessons
http://www.amazon.com/Unix-Linux-Visual-QuickStart-Edition/dp/0321636783
http://www.amazon.com/Practical-Computing-Biologists-Steven-Haddock/dp/0878933913/
Why do many programming languages count from 0? http://exple.tive.org/blarg/2013/10/22/citation-needed/
You may also enjoy http://neverworkintheory.org/2014/01/29/stefik-siebert-syntax.html and/or http://www.amazon.com/Bad-Data-Handbook-Cleaning-Back/dp/1449321887/
12 Steps to Navier Stokes
http://lorenabarba.com/blog/cfd-python-12-steps-to-navier-stokes/
Please go to http://regexpal.com/
Put this data in the bottom box
Baker 1 2009-11-17 1223.0
Baker 1 2010-06-24 1122.7
Baker 2 2009-07-24 2819.0
Baker 2 2010-08-25 2971.6
Site/Date/Evil
Davison/May 22, 2010/1721.3
Davison/May 23, 2010/1724.7
Pertwee/May 24, 2010/2103.8
Davison/June 19, 2010/1731.9
Davison/July 6, 2010/2010.7
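The two data formats above can also be pulled apart with regular expressions in Python. This is only a minimal sketch: the names PATTERN_1, PATTERN_2 and parse_line are made up for illustration, and the patterns assume every line looks like the samples above.

```python
import re

# Space-separated format: site name (may contain spaces), ISO date, reading.
PATTERN_1 = re.compile(r"^(.+) (\d{4})-(\d{2})-(\d{2}) (\d+\.\d+)$")
# Slash-separated format: site, "Month day, year", reading.
PATTERN_2 = re.compile(r"^(.+)/([A-Za-z]+) (\d{1,2}), (\d{4})/(\d+\.\d+)$")


def parse_line(line):
    """Return (site, year, month, day, reading) as strings, or None if nothing matches."""
    m = PATTERN_1.match(line)
    if m:
        site, year, month, day, reading = m.groups()
        return site, year, month, day, reading
    m = PATTERN_2.match(line)
    if m:
        site, month, day, year, reading = m.groups()
        return site, year, month, day, reading
    return None  # e.g. the "Site/Date/Evil" header line


for line in ["Baker 1 2009-11-17 1223.0",
             "Davison/May 22, 2010/1721.3",
             "Site/Date/Evil"]:
    print(parse_line(line))
```

Both formats come back in the same field order, so later code does not need to know which notebook a line came from.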
Who's Here
- Greg Wilson (Mozilla)
- Gabriel Devenyi (McMaster)
- Michael (Physics)
- Jasper - Physics
- Zach - Physics
- Hurmiz (Physics)
- Rui - Biophysics
- Alan Morningstar (Physics, McMaster)
- Alex - P&A
- Gordon (Physics)
- Phil (Electrical Eng)
- Damien - P&A
- Ben - P&A
- Kaz - P&A
- Jon - P&A
- Sam - P&A
- Anna (Physics)
- Gandhali - P&A
- Alex - Physics
- Tara Parkin - Physics & Astronomy
- Peter - Physics & Astronomy
- Markus Rose (P & A)
- Jen Hyde (Matls Eng)
- Steffi Woo (Materials Eng)
- Hadi - Electrical Eng
- Izaak Lea (Engineering Physics)
- Fei Yang (Materials Eng)
- Aaron J. Maxwell (Astrophysics)
- Ray Ng (P&A)
- Rory Woods (P&A)
Notes on Human Memory
- Short-term memory holds at most 7 +/- 2 items at once
Human attention span
- Maximum attention span 45-90 minutes
- Turn off the phone = increased productivity
You should build a program in stages, out of small pieces that each do one job really well. This makes your program modular and easier to read, and the small pieces can be reused for a number of different jobs.
You should always document your programs. Short explanations are better, because the in-depth explanation can be had by reading the code itself.
Excel is inadvisable for data analysis because it doesn't keep a record of the steps you performed, so the process can't be reviewed or repeated.
Notes on the Unix Shell
- ls -t := time-ordered directory listing (most recently modified first)
- ls is short for "list". It lists the contents of a directory. You can modify its behaviour by adding flags: a - (minus) followed by a character. Ex. ls -t means time-ordered listing.
- Well-behaved software should only print information when it is needed. Ex. mkdir makes a new directory but does not tell you it has made one. However, trying to make a directory that already exists gives an error message.
- wc is the word count command. Output is #lines #words #bytes. wc -l prints only how many lines are in the file.
- ![command number] re-runs that command from the history
- "history" for a list of previously entered commands, with command numbers
- 'history -c' clears your history
- "cat" - shows what's in the file
- cut -d , -f [column] : cut extracts columns from each line; -d sets the delimiter (here a comma) and -f says which column of data you want
- command > filename sends the output that would go to the screen into the file instead
- cut -d [delimiter] -f 1 file.txt lists the first column of data, ex. cut -d , -f 1 file.txt
- grep stands for "global regular expression print". grep -v is inverted grep, so it prints every line without the listed pattern instead of every line with it
- sort file sorts file alphabetically
- cut -d [delimiter] -f [column] file1.txt > file2.txt : writes the selected column into file2.txt
- uniq : gets rid of adjacent duplicate lines only, because it remembers just one line of data at a time. Sort the data first so that duplicates end up next to each other.
- grep [search] file.txt finds lines in the file that match; grep -v inverts, and lists lines that don't match
- grep -v [word] prints all lines that don't contain [word]
- We cut, sorted, spit out unique values, and removed data we didn't want in four steps. Bash allows us to write pipelines using | (shift+backslash) to pipe the output of one command into another command, similar to how we used > to redirect output from a command into a text file.
- uniq -c shows a count of how many times each unique line occurs
- Make use of Tab Completion!!
- In a bash script, $1, $2, ... stand for the 1st, 2nd, ... argument passed to the script on the command line
- ? (question mark) matches exactly one character, i.e. fish? matches fish2 but not fish
- Put commands you reuse constantly into a shell script written in a text editor: file.sh, and run it with bash file.sh
- ^D (Control-D) ends input in Unix (for when you're typing input directly into the terminal)
- Control-C interrupts a command that is still running
- Put $1 in the script in place of the filename, then give the filename after file.sh (bash file.sh data.txt). You need to give a filename so the command doesn't just read from standard input
- Wildcard matching: * matches any number of characters, ? matches exactly one character
- mv moves or renames files: mv file1.txt file2.txt
- cp copies files
- For loops: for x in *.txt; do action $x; done, where action is a command such as bash file.sh, and file.sh has a $1 in place to get a filename
- Regular Expressions - a general purpose tool for searching for patterns in text. Very robust and powerful. Many programs support regular expressions, such as grep, bash, etc. (Play Regex Golf: http://regex.alf.nu/)
- When writing a data file, be mindful of what will go into reading it later on!
- ^R searches the command history
- !! repeats the previous command, !$ is the last argument of the previous command.
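To see what each stage of the cut / grep -v / sort / uniq -c pipeline is doing, here is a rough Python imitation run on the Site/Date/Evil data from earlier in these notes. The helper names cut, grep_v and uniq_c are made up for this sketch and only mimic the basic behaviour of the real commands. The shell version would look something like cut -d / -f 1 data.txt | grep -v Site | sort | uniq -c, where data.txt is a hypothetical filename.

```python
from itertools import groupby

lines = [
    "Site/Date/Evil",
    "Davison/May 22, 2010/1721.3",
    "Davison/May 23, 2010/1724.7",
    "Pertwee/May 24, 2010/2103.8",
    "Davison/June 19, 2010/1731.9",
    "Davison/July 6, 2010/2010.7",
]


def cut(rows, delimiter, field):
    """Like cut -d delimiter -f field: keep one column (fields count from 1)."""
    return [row.split(delimiter)[field - 1] for row in rows]


def grep_v(rows, word):
    """Like grep -v word: keep only the rows that do NOT contain word."""
    return [row for row in rows if word not in row]


def uniq_c(rows):
    """Like uniq -c: collapse ADJACENT duplicates and count each run."""
    return [(len(list(group)), key) for key, group in groupby(rows)]


sites = grep_v(cut(lines, "/", 1), "Site")
print(uniq_c(sites))          # without sorting, Davison shows up twice
print(uniq_c(sorted(sites)))  # sorted first, each site appears once
```

The two print calls show why the data has to be sorted before uniq: unsorted, the Davison lines are split into two runs by the Pertwee line.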
Notes on Python and the IPython Notebook
- Lorena A Barba: CFD Python notebooks
- assert [condition]: the condition (x > 0, for example) must be true, otherwise the program aborts with an AssertionError
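A small sketch of assert guarding a function's input. The function name mean and the error message are made up for illustration.

```python
def mean(values):
    """Average of a non-empty list of numbers."""
    # Stop right away with a readable message instead of dividing by zero later.
    assert len(values) > 0, 'need at least one value'
    return sum(values) / float(len(values))


print(mean([1, 2, 3]))

try:
    mean([])
except AssertionError as error:
    print('assertion failed: ' + str(error))
```

The text after the comma is the message attached to the AssertionError, which is the same form used in the gabriel tests further down.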
Let's Write a Gabriel
def gabriel(numbers):
    return [] # this is probably wrong
# You may like this:
result = 0
for n in [1, 2, 3]:
    result = result + n
# final value of result is 1 + 2 + 3
# and this
result = [] # an empty list
for n in [1, 2, 3]:
    temp = n * n # calculate the square
    result.append(temp) # put it on the end of the result list
# result is now [1, 4, 9]
def gabriel(nums):
    sums = [] # running totals so far
    last = 0
    total = 0
    for n in nums:
        if last > n: # the value dropped, so restart the running total
            total = 0
        total += n
        sums.append(total)
        last = n
    return sums
def gabriel(lst):
    result = []
    if lst == []:
        return result
    result.append(lst[0])
    for i in range(1, len(lst)):
        res = result[i-1]
        prev = lst[i-1]
        current = lst[i]
        if current >= prev:
            result.append(current+res)
        else:
            result.append(current)
    return result
assert gabriel([]) == [], 'empty list'
assert gabriel([1]) == [1], 'single value'
assert gabriel([1, 2]) == [1, 3], 'increasing sequence'
assert gabriel([1, 2, 3]) == [1, 3, 6], 'is this useful?'
assert gabriel([1, 2, 0, 3]) == [1, 3, 0, 3], 'contains one drop'
assert gabriel([5, 6, 2, 3, 4]) == [5, 11, 2, 5, 9], 'contains a non-zero drop'
assert gabriel([1, 2, 2, 3]) == [1, 3, 5, 8], 'flatlines'
assert gabriel([0, -3, -2, -1]) == [0, -3, -5, -6]
print('holy cow, Phil is a coding god!')
Let's Look at Databases
select distinct visited.dated, site.lat, site.long from visited join site on site.name = visited.site where visited.dated is not null;
-- Get unique date/lat/long for all visits.
select distinct visited.dated, site.lat, site.long
from visited join site
on site.name=visited.site
where visited.dated is not null;
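The query above can be tried from Python with the built-in sqlite3 module. This is only a sketch: the table layouts are guessed from the column names used in the query, the rows are invented for illustration (they are not the workshop's data), and an order by clause is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
create table site (name text, lat real, long real);
create table visited (ident integer, site text, dated text);
insert into site values ('A-1', -49.85, -128.57);
insert into site values ('B-2', -47.15, -126.72);
insert into visited values (1, 'A-1', '1927-02-08');
insert into visited values (2, 'A-1', '1927-02-08'); -- same site and date twice
insert into visited values (3, 'B-2', NULL);         -- visit with no date
insert into visited values (4, 'B-2', '1930-01-07');
""")

query = """
select distinct visited.dated, site.lat, site.long
from visited join site
on site.name = visited.site
where visited.dated is not null
order by visited.dated;
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

With these rows, distinct collapses the repeated A-1 visit into one result, and the is not null filter drops the undated B-2 visit.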
Version Control and Git
http://www.github.com -- Please sign up for an account if you don't have one
https://github.com/gdevenyi/mcmaster.latex
Well done everyone: http://i.imgur.com/59KTQ.gif