ongoing methodological crisis
the results of many scientific studies are hard or impossible to reproduce.
empirical reproductions are essential for the scientific method
Most techniques come from the DevOps world!
Create a reusable artifact
Increased collaboration
Shared responsibility
Autonomous teams
Focus on the process, not just the product
Risk management
Resource exploitation
Principles inspire practices
Practices require tools
Did you ever need to roll back some project or assignment to a previous version?
How did you track the history of the project?
Inefficient!
Did you ever need to develop some project or assignment as a team?
How did you organize the work to maximize the productivity?
Tools meant to support the development of projects by:
Distributed: Every copy of the repository contains (i.e., every developer locally has) the entire history.
Centralized: A reference copy of the repository contains the whole history; developers work on a subset of such history
Git is now the dominant DVCS (although Mercurial is still in use, e.g., for Python, Java, Facebook).
At first glance, the history of a project looks like a line.
Anything that can go wrong will go wrong
$1^{st}$ Murphy’s law
If anything simply cannot go wrong, it will anyway
$5^{th}$ Murphy’s law
Go back in time to a previous state where things work
Then fix the mistake
If you consider rollbacks, history is a tree!
Alice and Bob work together for some time, then they go home and work separately, in parallel
They have a diverging history!
If you have the possibility to reconcile diverging developments, the history becomes a graph!
Reconciling diverging developments is usually referred to as merge
Project meta-data. Includes the whole project history
Usually, stored in a hidden folder in the root folder of the project
(or worktree, or working directory)
the collection of files (usually, inside a root folder) that constitute the project, excluding the meta-data.
A saved status of the project.
A named sequence of commits
If no branch has been created at the first commit, a default name is used.
To be able to go back in time or change branch, we need to refer to commits
References to commits are called tree-ishes
Appending ~ and a number i to a valid tree-ish means “i-th parent of this tree-ish”
The operation of moving to another commit
Moves the HEAD to the specified target tree-ish
Let us try to see what happens when we develop some project, step by step.
Oh, no, there was a mistake! We need to roll back!
6
whenever we want to.5
, I’d like to have it into new-branch
Notice that:
8 is a merge commit, as it has two parents: 7 and 5
De-facto reference distributed version control system
¹ Less difference now, Facebook vastly improved Mercurial
Git is a command line tool
Although graphical interfaces exist, it makes no sense to learn a GUI:
I am assuming minimal knowledge of the shell, please let me know NOW if you’ve never seen it
Configuration in Git happens at two levels
Set up the global options reasonably, then override them at the repository level, if needed.
git config
The config subcommand sets the configuration options
When run with the --global option, configures the tool globally
git config [--global] category.option value
Sets option of category to value
As said, --global can be omitted to override the global settings locally
user.name and user.email
A name and a contact are always saved as metadata, so they need to be set up
git config --global user.name "Your Real Name"
git config --global user.email "your.email.address@your.provider"
Some operations pop up a text editor.
It is convenient to set it to a tool that you know how to use
(to prevent, e.g., being “locked” inside vi or vim).
Any editor that you can invoke from the terminal works.
git config --global core.editor nano
How to name the default branch.
Two reasonable choices are main and master
git config --global init.defaultbranch master
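git init
Initializes a new repository inside the current directory, creating the .git folder
The location of the .git folder marks the root of the repository
Use cd to locate yourself inside the folder that contains (or will contain) the project
(possibly, first create it with mkdir)
Run git init
The repository metadata now lives inside the .git folder.
Git has the concept of stage (or index).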
git add <files> moves the current state of the files into the stage as changes
git reset <files> removes currently staged changes of the files from the stage
git commit creates a new changeset with the contents of the stage
It is extremely important to understand clearly what the current state of affairs is
git status prints the current state of the repository, example output:
❯ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: content/_index.md
new file: content/dvcs-basics/_index.md
new file: content/dvcs-basics/staging.png
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: layouts/shortcodes/gravizo.html
modified: layouts/shortcodes/today.html
git config --global user.name 'Your Real Name'
git config --global user.email 'your@email.com'
git config user.name 'Your Real Name'
git config user.email 'your@email.com'
Commits require a message, which can be provided with the -m option, otherwise Git will pop up the default editor
git commit -m 'my very clear and explanatory message'
At the first commit, there is no branch and no HEAD.
Depending on the version of Git, the following behavior may happen upon the first commit:
Git creates a new branch named master
Git creates a new branch named master, but warns that it is a deprecated behavior
Git creates a new branch named main, as it is seen as more inclusive
The name can be fixed explicitly with git config --global init.defaultbranch default-branch-name
In general, we do not want to track all the files in the repository folder:
Of course, we could just not add them, but the error is around the corner!
It would be much better to just tell Git to ignore some files.
This is achieved through a special .gitignore file.
The file must be named exactly .gitignore, names like foo.gitignore or gitignore.txt won’t work
A quick way to add a rule is echo whatWeWantToIgnore >> .gitignore (multiplatform command)
Ignored files are not staged (unless git add is called with the --force option)
.gitignore example
# ignore the bin folder and all its contents
bin/
# ignore every pdf file
*.pdf
# rule exception (beginning with a !): pdf files named 'myImportantFile.pdf' should be tracked
!myImportantFile.pdf
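The way the rules above combine can be sketched in a few lines of Python. This is a simplified model, not Git's actual matcher: it assumes that folder rules only apply at the repository root and that file rules are matched against the file name alone. The function name `is_ignored` and the sample paths are made up for illustration.

```python
from fnmatch import fnmatchcase

# The rules of the example .gitignore, in order
RULES = [
    "# ignore the bin folder and all its contents",
    "bin/",
    "*.pdf",
    "!myImportantFile.pdf",
]

def is_ignored(path, rules):
    """Return True if `path` is ignored: the last matching rule wins."""
    ignored = False
    for rule in rules:
        rule = rule.strip()
        if not rule or rule.startswith("#"):
            continue  # blank lines and comments carry no rule
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern.endswith("/"):
            # folder rule (simplified): everything below that root folder
            matches = path.startswith(pattern)
        else:
            # file rule (simplified): compared with the file name only
            matches = fnmatchcase(path.rsplit("/", 1)[-1], pattern)
        if matches:
            ignored = not negated
    return ignored

for candidate in ["bin/app.exe", "docs/report.pdf", "docs/myImportantFile.pdf"]:
    print(candidate, is_ignored(candidate, RULES))
```

The exception rule works only because it comes after the `*.pdf` rule: order matters.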
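Going to a new line is a two-phased operation: the cursor goes back to the beginning of the line, then down by one line
In electromechanic teletypewriters (and in typewriters, too), they were two distinct operations: Carriage Return (CR) and Line Feed (LF)
Terminals were designed to behave like virtual teletypewriters
They are still called tty
At some point, Unix decided that LF was sufficient in virtual TTYs to go to a new line
(as in C, where \n means “newline”)
With a line feed and no carriage return, we would get
lines
     like these
Consequence:
Windows: a CR character followed by an LF character: \r\n
Unix: an LF character: \n
Classic Mac OS: a CR character: \r
macOS: \n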
If your team uses multiple OSs, it is likely that, by default, the text editors use either LF (on Unix) or CRLF (on Windows)
It is also very likely that, upon saving, the whole file gets rewritten with the “locally correct” line endings
Git tries to tackle this issue by converting the line endings so that they match the initial line endings of the file, resulting in repositories with illogically mixed line endings (depending on who created a file first) and loads of warnings about LF/CRLF conversions.
Line endings should instead be configured per file type!
.gitattributes
A sensible strategy is to use LF everywhere, but for Windows scripts (bat, cmd, ps1)
This is configured in a .gitattributes file in the repository root
* text=auto eol=lf
*.[cC][mM][dD] text eol=crlf
*.[bB][aA][tT] text eol=crlf
*.[pP][sS]1 text eol=crlf
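As a rough illustration of what these four rules ask for, the following Python sketch picks a line ending per file name and rewrites a text accordingly. It only models the outcome in the working tree and is not how Git implements the conversion; `eol_for`, `normalize`, and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# (pattern, line ending) pairs mirroring the .gitattributes rules; later rules override earlier ones
ATTRIBUTES = [
    ("*", "\n"),
    ("*.[cC][mM][dD]", "\r\n"),
    ("*.[bB][aA][tT]", "\r\n"),
    ("*.[pP][sS]1", "\r\n"),
]

def eol_for(filename):
    """Line ending configured for `filename` (LF unless a later rule says otherwise)."""
    eol = "\n"
    for pattern, candidate in ATTRIBUTES:
        if fnmatchcase(filename, pattern):
            eol = candidate
    return eol

def normalize(text, filename):
    """Rewrite `text` so that every line ends as configured for `filename`."""
    # collapse every convention to LF first, then apply the configured one
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", eol_for(filename))

print(repr(normalize("a\r\nb\rc\n", "notes.txt")))
print(repr(normalize("echo hi\n", "build.BAT")))
```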
git add adds a change to the stage, and a change can also be a deletion
git add someDeletedFile is a correct command, that will stage the fact that someDeletedFile does not exist anymore, and its deletion must be registered at the next commit.
To stage the renaming of foo into bar:
git add foo bar
It is sufficient to stage that foo has been deleted and bar has been created
Of course, it is useful to visualize the history of commits. Git provides a dedicated sub-command:
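git log
Lists the commits from the HEAD commit (the current commit) backwards
git log --oneline shows one commit per line
git log --all also lists the commits of the other branches
git log --graph draws the commit graph alongside the list
The options can be combined, for instance: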
git log --oneline --all --graph
* d114802 (HEAD -> master, origin/master, origin/HEAD) moar contribution
| * edb658b (origin/renovate/gohugoio-hugo-0.94.x) ci(deps): update gohugoio/hugo action to v0.94.2
|/
* 4ce3431 ci(deps): update gohugoio/hugo action to v0.94.1
* 9efa88a ci(deps): update gohugoio/hugo action to v0.93.3
* bf32a8b begin with build slides
* b803a65 lesson 1 looks ready
* 6a85f8f ci(deps): update gohugoio/hugo action to v0.93.2
* b474d2a write more on the introductory lesson
* 8a7105e ci(deps): update gohugoio/hugo action to v0.93.1
* 6e40642 begin writing the first lesson
<tree-ish>
In git, a reference to a commit is called <tree-ish>. Valid <tree-ish>es are:
Full commit hashes, such as b82f7567961ba13b1794566dde97dda1e501cf88.
Shortened commit hashes, such as b82f7567.
Branch names, which refer to the last commit of the branch.
HEAD, a special name referring to the current commit (the head, indeed).
It is possible to build relative references, e.g., “get me the commit before this <tree-ish>”, by following the commit <tree-ish> with a tilde (~) and with the number of parents to get to:
<tree-ish>~STEPS
where STEPS is an integer number produces a reference to the STEPS-th ancestor of the provided <tree-ish>, walking back one parent at a time:
b82f7567~1 references the parent of commit b82f7567.
some_branch~2 refers to the parent of the parent of the last commit of branch some_branch.
HEAD~3 refers to the parent of the parent of the parent of the current commit.
In case of merge commits (with multiple parents), ~ selects the first one
Selection of parents can be performed with caret in case of multiple parents (^)
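The tilde notation can be sketched as a walk along first parents. The history below is hypothetical (commit names `C1` to `C5`, with `C5` a merge commit), and `resolve` is a made-up helper, not a Git API.

```python
# Hypothetical history: each commit maps to the list of its parents (first parent first)
PARENTS = {"C1": [], "C2": ["C1"], "C3": ["C2"], "C4": ["C2"], "C5": ["C3", "C4"]}
# Hypothetical names: HEAD is on the merge commit, some_branch ends on C4
REFS = {"HEAD": "C5", "some_branch": "C4"}

def resolve(tree_ish):
    """Resolve 'name', 'name~' or 'name~N' to a commit of the toy history."""
    name, tilde, steps = tree_ish.partition("~")
    count = int(steps) if steps else (1 if tilde else 0)  # a bare ~ means ~1
    commit = REFS.get(name, name)
    for _ in range(count):
        commit = PARENTS[commit][0]  # ~ always follows the first parent
    return commit

for reference in ["HEAD", "HEAD~1", "HEAD~2", "some_branch~2"]:
    print(reference, "->", resolve(reference))
```

Since `~` follows first parents only, `HEAD~1` lands on `C3` and never on `C4`, the second parent of the merge.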
The git rev-parse reference on specifying revisions is publicly available
We want to see which differences a commit introduced, or what we modified in some files of the work tree
Git provides support to visualize the changes in terms of modified lines through git diff:
git diff shows the difference between the stage and the working tree (what would be staged by running git add)
git diff --staged shows the difference between HEAD and the stage (what would be recorded by the next commit)
git diff <tree-ish> shows the difference between <tree-ish> and the working tree
git diff --staged <tree-ish> shows the difference between <tree-ish> and the stage
git diff <from> <to>, where <from> and <to> are <tree-ish>es, shows the differences between <from> and <to>
git diff
Example output:
diff --git a/.github/workflows/build-and-deploy.yml b/.github/workflows/build-and-deploy.yml
index b492a8c..28302ff 100644
--- a/.github/workflows/build-and-deploy.yml
+++ b/.github/workflows/build-and-deploy.yml
@@ -28,7 +28,7 @@ jobs:
# Idea: the regex matcher of Renovate keeps this string up to date automatically
# The version is extracted and used to access the correct version of the scripts
USES=$(cat <<TRICK_RENOVATE
- - uses: gohugoio/hugo@v0.94.1
+ - uses: gohugoio/hugo@v0.93.3
TRICK_RENOVATE
)
echo "Scripts update line: \"$USES\""
The output is compatible with the Unix commands diff and patch
Still, binary files are an issue! Tracking the right files is paramount.
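The line-oriented format shown in the example output is the unified diff format, which can also be produced with Python's standard difflib module. A minimal sketch, assuming two tiny versions of a hypothetical file called build.yml:

```python
import difflib

# Two versions of a hypothetical two-line file
old = ["uses: gohugoio/hugo@v0.94.1", "run: hugo"]
new = ["uses: gohugoio/hugo@v0.93.3", "run: hugo"]

def unified(before, after, name):
    """Unified diff between two lists of lines, with git-like a/ and b/ prefixes."""
    return list(difflib.unified_diff(
        before, after, fromfile="a/" + name, tofile="b/" + name, lineterm=""
    ))

for line in unified(old, new, "build.yml"):
    print(line)
```

Removed lines start with `-`, added lines with `+`, and unchanged context lines with a space, which is why the format only makes sense for text files.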
Navigation of the history concretely means to move the head (in Git, HEAD) to arbitrary points of the history
In Git, this is performed with the checkout command:
git checkout <tree-ish>
Moves HEAD to the provided <tree-ish>
Updates the working tree to the version at the provided <tree-ish>
The command can be used to selectively checkout a file from another revision:
git checkout <tree-ish> -- foo bar baz
Restores the status of files foo, bar, and baz from commit <tree-ish>, and adds them to the stage (unless there are uncommitted changes that could be lost)
Note that -- is surrounded by whitespaces, it is not a --foo option, it is just used as a separator between the <tree-ish> and the list of files
The separator is needed because a file could be named like a <tree-ish> and we need disambiguation
Git does not allow multiple heads per branch
(other DVCS do, in particular Mercurial):
for a commit to be valid, HEAD must be at the “end” of a branch (on its last commit), as follows:
When an old commit is checked out this condition doesn’t hold!
If we run git checkout HEAD~4:
The system enters a special workmode called detached head.
When in detached head, Git allows making commits, but they are lost!
(Not really, but to retrieve them we need git reflog and git cherry-pick, that we won’t discuss)
To be able to start new development lines, we need to create a branch.
In Git, branches work like movable labels:
They are created on the commit HEAD refers to
When HEAD is attached to them, they move along with HEAD
Branches are created with git branch branch_name
⬇️ git branch new-experiment ⬇️
HEAD does not attach to the new branch by default, an explicit checkout is required.
Creating new branches allows storing changes made when we are in DETACHED_HEAD state.
⬇️ git checkout HEAD~4 ⬇️
⬇️ git branch new-experiment ⬇️
HEAD is still detached though, we need to attach it to the new branch for it to store our commits
⬇️ git checkout new-experiment ⬇️
⬇️ [changes] + git add + git commit ⬇️
$\Rightarrow$ HEAD brings our branch forward with it!
As you can imagine, creating a new branch and attaching HEAD to the freshly created branch is pretty common
As customary for common operations, a short-hand is provided: git checkout -b new-branch-name
Creates new-branch-name from the current position of HEAD
Attaches HEAD to new-branch-name
⬇️ git checkout -b new-experiment ⬇️
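The “movable label” behaviour can be condensed into a toy model. The `Repo` class below is purely illustrative (it is not how Git stores data, and all names are made up): branches are entries in a dictionary, and HEAD is either attached to one of them or detached on a commit.

```python
class Repo:
    """Toy model: branches are labels, HEAD is either attached to one or detached."""

    def __init__(self, default_branch="master"):
        self.parent = {}                   # commit -> its parent (None for the first one)
        self.branches = {}                 # branch name -> commit the label sits on
        self.attached_to = default_branch  # branch HEAD is attached to, None when detached
        self.head = None                   # commit HEAD refers to

    def commit(self):
        new = f"C{len(self.parent) + 1}"
        self.parent[new] = self.head
        self.head = new
        if self.attached_to is not None:
            self.branches[self.attached_to] = new  # the attached label moves along
        return new

    def branch(self, name):
        self.branches[name] = self.head  # created where HEAD is, HEAD does not attach

    def checkout(self, target):
        if target in self.branches:
            self.attached_to, self.head = target, self.branches[target]
        else:
            self.attached_to, self.head = None, target  # detached head

    def checkout_b(self, name):
        self.branch(name)
        self.checkout(name)


repo = Repo()
for _ in range(5):
    repo.commit()                  # C1..C5 on master
repo.checkout("C1")                # detached head on an old commit
repo.checkout_b("new-experiment")  # new label on C1, HEAD attached to it
repo.commit()                      # C6: only new-experiment moves
print(repo.branches)
```

Committing while detached would still create a commit in this model, but no label would follow it, which is why such commits are hard to find later.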
Reunifying diverging development lines is much trickier than spawning new development lines
In other words, merging is much trickier than branching
In Git, git merge target merges the branch named target into the current branch (HEAD must be attached)
⬇️ git merge master ⬇️
Consider this situation:
We want new-experiment to also have the changes from C7, to C10 (to be up to date with master)
master contains all the commits of new-experiment
No merge commit is needed, we can just move new-experiment to point it to C6
Git tries to resolve most conflicts by itself
In case of conflict on one or more files, Git marks the subject files as conflicted, and modifies them adding merge markers:
<<<<<<< HEAD
Changes made on the branch that is being merged into,
this is the branch currently checked out (HEAD).
=======
Changes made on the branch that is being merged in.
>>>>>>> other-branch-name
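Once the conflicted files have been fixed, the resolution is staged with git add
The merge is then concluded with git commit
A default merge message is proposed, git commit --no-edit can be used to use it without editing
Avoiding merge conflicts is much better than solving them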
Although they are unavoidable in some cases, they can be minimized by following a few good practices:
feature
into master
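When a conflict does happen, the merge markers described earlier delimit the two competing versions, and they are regular enough to be processed mechanically. A minimal Python sketch, assuming a single conflict per file (the helper name `split_conflict` and the sample lines are made up):

```python
def split_conflict(text):
    """Return (ours, theirs) line lists for the first conflict found in `text`."""
    ours, theirs, section = [], [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            section = ours            # lines from the branch being merged into
        elif line.startswith("=======") and section is ours:
            section = theirs          # lines from the branch being merged in
        elif line.startswith(">>>>>>> "):
            break                     # end of the conflict
        elif section is not None:
            section.append(line)
    return ours, theirs


conflicted = "\n".join([
    "unchanged line",
    "<<<<<<< HEAD",
    "local line",
    "=======",
    "incoming line",
    ">>>>>>> other-branch-name",
])
print(split_conflict(conflicted))
```

Resolving the conflict means replacing the whole marked region with the desired content, then staging and committing.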
Branches work like special labels that move if a commit is performed when HEAD is attached.
Also, the history tracked by git is a directed acyclic graph (each commit has a reference to its parents)
$\Rightarrow$ Branches can be removed without information loss, as long as there is at least another branch from which all the commits of the deleted branch are reachable
Safe branch deletion is performed with git branch -d branch-name (fails if there is information loss).
⬇️ git branch -d fix/bug22 ⬇️
No commit is lost, branch fix/bug22 is removed
What about git branch -d feat/serverless?
It would fail with an error message, as 11 would be lost
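The “no information loss” condition is a reachability question on the commit graph. The sketch below follows the wording above (commits must be reachable from at least another branch); Git's actual check for -d is stricter, as it looks at the upstream branch or at HEAD. The graph is hypothetical, only shaped after the example.

```python
# Hypothetical history: commit -> list of parents
PARENTS = {"8": [], "9": ["8"], "10": ["9"], "11": ["9"]}
BRANCHES = {"master": "10", "fix/bug22": "9", "feat/serverless": "11"}

def reachable(parents, tip):
    """All the commits reachable from `tip` by following parent links."""
    seen, todo = set(), [tip]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen

def lost_if_deleted(parents, branches, name):
    """Commits that no other branch could reach if `name` were deleted."""
    kept = set()
    for other, tip in branches.items():
        if other != name:
            kept |= reachable(parents, tip)
    return reachable(parents, branches[name]) - kept

print(lost_if_deleted(PARENTS, BRANCHES, "fix/bug22"))
print(lost_if_deleted(PARENTS, BRANCHES, "feat/serverless"))
```

An empty result means the deletion is safe under this model.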
Besides creating a repository from scratch with git init, an existing one can be copied
Git provides a clone subcommand that copies the whole history of a repository locally
git clone URI destination creates the folder destination and clones the repository found at URI
If destination is not empty, fails
If destination is omitted, a folder with the same name of the last segment of URI is created
URI can be remote or local, Git supports the file://, https://, and ssh protocols
ssh recommended when available
The clone subcommand checks out the remote branch where the HEAD is attached (default branch)
Examples:
git clone /some/repository/on/my/file/system destination
Creates a local folder called destination and copies the repository from the local directory
git clone https://somewebsite.com/someRepository.git myfolder
Creates a local folder called myfolder and copies the repository located at the specified URL
git clone user@sshserver.com:SomePath/SomeRepo.git
Creates a local folder called SomeRepo and copies the repository located at the specified URL
If a repository was created with init, no remote is known.
If a repository was created with clone, a remote called origin is created automatically
Non-local branches can be referenced as remoteName/branchName
The remote subcommand is used to inspect and manage remotes:
git remote -v lists the known remotes
git remote add a-remote URI adds a new remote named a-remote and pointing to URI
git remote show a-remote displays extended information on a-remote
git remote remove a-remote removes a-remote (it does not delete information on the remote, it locally forgets that it exists)
Remote branches can be associated with local branches, meaning that the local and the remote branch are intended to be two copies of the same branch
Upstream branches can be configured via git branch --set-upstream-to=remote/branchName
git branch --set-upstream-to=origin/develop sets the current branch upstream to origin/develop
When a repository is initialized by clone, its default branch is checked out locally with the same name it has on the remote, and the remote branch is automatically set as upstream
git clone git@somesite.com/repo.git
git@somesite.com/repo.git is saved as origin
The default branch (the branch where HEAD is attached, in our case master) on origin gets checked out locally with the same name
The local branch master is set up to track origin/master as upstream
git branch (or git checkout -b) can checkout remote branches locally once they have been fetched.
⬇️ git checkout -b imported-feat origin/feat/serverless ⬇️
A new branch imported-feat is created locally, and origin/feat/serverless is set as its upstream
It is customary to reuse the name of the remote branch: git checkout -b feat/new-client origin/feat/new-client
Recent versions of Git can do it automatically: git checkout feat/new-client creates a local branch feat/new-client with the upstream branch set to origin/feat/new-client if:
there is no local branch named feat/new-client
exactly one remote has a branch named feat/new-client
⬇️ git clone git@somesite.com/repo.git ⬇️
⬇️ git checkout -b feat/serverless origin/feat/serverless ⬇️
⬇️ git remote add other git@somewhereelse.org/repo.git ⬇️
⬇️ git checkout -b other-master other/master ⬇️
You can operate with multiple remotes! Just remember: branch names must be unique for every repository
If you want to track origin/master and anotherRemote/master, you need two local branches with different names
To check if a remote has any update available, git provides the git fetch subcommand.
git fetch a-remote checks if a-remote has any new information. If so, it downloads it.
git fetch without a remote:
if HEAD is attached and the current branch has an upstream, then the remote that is hosting the upstream branch is fetched
otherwise, origin is fetched, if present
Fetching does not alter the local branches: to apply the updates, use merge
The new information fetched includes new commits, branches, and tags.
⬇️ Changes happen on somesite.com/repo.git and on our repository concurrently ⬇️
⬇️ git fetch && git merge origin/master (assuming no conflicts or conflicts resolved) ⬇️
If there had been no updates locally, we would have experienced a fast-forward
git pull
Fetching the remote with the upstream branch and then merging is extremely common, so common that there is a special subcommand that does both.
git pull is equivalent to git fetch && git merge FETCH_HEAD
git pull remote is the same as git fetch remote && git merge FETCH_HEAD
git pull remote branch is the same as git fetch remote && git merge remote/branch
git pull is more commonly used than git fetch + git merge, still, it is important to understand that it is not a primitive operation
Git provides a way to send changes to a remote: git push remote branch
Sends the local changes to remote/branch, and updates the remote HEAD
push requires writing rights to the remote repository
push fails if the pushed branch is not a descendant of the destination branch, which means:
the destination branch contains commits that are not in the local branch
the destination branch cannot be fast-forwarded to the local branch
By default, git push does not send tags
git push --tags sends only the tags
git push --follow-tags sends commits and then tags
⬇️ [some changes] git add . && git commit ⬇️
⬇️ git push ⬇️
origin/master was a subset of master
The remote HEAD can be fast-forwarded
⬇️ someone else pushes a change ⬇️
⬇️ [some changes] git add . && git commit ⬇️
⬇️ git push ⬇️
ERROR
To somesite.com/repo.git
! [rejected] master -> master (fetch first)
error: failed to push some refs to 'somesite.com/repo.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
master is not a superset of origin/master
10 is in origin/master but not in master, preventing a remote fast-forward
⬇️ git pull (assuming no merge conflicts, or after conflict resolution) ⬇️
Now master is a superset of origin/master! (all the commits in origin/master, plus 11 and 12)
⬇️ git push ⬇️
The push succeeds now!
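The rule behind the rejected and the accepted push can be expressed as an ancestry test: the push goes through only if the remote branch tip is an ancestor of the local one. The sketch below uses a hypothetical graph shaped after the walkthrough (commits 9 and 13 are made up, 13 being the merge commit created by the pull); it illustrates the rule and is not Git's implementation.

```python
# Hypothetical history: commit -> list of parents; 13 merges local work (12) with remote work (10)
PARENTS = {"9": [], "10": ["9"], "11": ["9"], "12": ["11"], "13": ["12", "10"]}

def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` can be reached from `descendant` through parent links."""
    todo, seen = [descendant], set()
    while todo:
        commit = todo.pop()
        if commit == ancestor:
            return True
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return False

def can_push(parents, local_tip, remote_tip):
    """A push is accepted only if the remote branch can be fast-forwarded."""
    return is_ancestor(parents, remote_tip, local_tip)

print(can_push(PARENTS, "12", "10"))  # before the pull
print(can_push(PARENTS, "13", "10"))  # after the pull
```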
It is often handy to associate some commits with a symbolic name, most of the time to assign versions.
For instance, we may want to label commit 8d400c0 as version 1.2.3
Although in principle branches could be used to do so, their nature is of moving labels: when HEAD is attached, new commits move the branch forward.
We would like to have branches to which HEAD cannot attach (hence, they can’t be moved from their creation point).
⬇️ git checkout C4 && git branch 1.2.3 && git checkout master ⬇️
Looks good, but if we do something like: ⬇️ git checkout 1.2.3 [some changes] git commit ⬇️
Our version moved, we never want this to happen!
The tag subcommand creates permanent labels attached to commits.
Tags come in two fashions:
Lightweight tags are plain names pointing to a commit
Annotated tags (option -a) store additional information: a message, and, optionally, a signature (option -s/-u)
⬇️ git checkout C4 && git tag 1.2.3 ⬇️
HEAD cannot attach to tags!
Tags are not pushed by default.
To push tags, use git push --tags after a normal push.
Alternatively, use git push --follow-tags to push both commits and tags.
Several services allow the creation of shared repositories on the cloud. They enrich the base git model with services built around the tool:
repositories are uniquely identified by an owner and a repository name
owner/repo is a name unique to every repository
GitHub supports two kinds of authentication:
HTTPS, with a personal access token generated with the repo access scope at https://github.com/settings/tokens/new
The token is then included in the repository URI: https://github.com/owner/repo.git becomes: https://token@github.com/owner/repo.git
SSH, with a key pair
Disclaimer: this is a “quick and dirty” way of generating and using SSH keys.
You are warmly recommended to learn how it works and the best security practices.
ssh-keygen
cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3Nza<snip, a lot of seemingly random chars>PIl+qZfZ9+M= you@your_hostname
You are all set! Enjoy your secure authentication.
The practice of integrating code with a main development line continuously
Verifying that the build remains intact
Traditionally, protoduction is jargon for a prototype that ends up in production
Software that promotes CI practices should:
Plenty of integrators on the market
Circle CI, Travis CI, Wercker, drone.io, Codefresh, Codeship, Bitbucket Pipelines, GitHub Actions, GitLab CI/CD Pipelines, JetBrains TeamCity…
Naming and organization is variable across platforms, but in general:
In essence, designing a CI system is designing a software construction, verification, and delivery pipeline with the abstractions provided by the selected provider.
Configuration can grow complex, and is usually stored in a YAML file
(but there are exceptions, JetBrains TeamCity uses a Kotlin DSL).
Workflows are configured in YAML files located in the default branch of the repository in the .github/workflows folder.
One configuration file $\Rightarrow$ one workflow
For security reasons, workflows may need to be manually activated in the Actions tab of the GitHub web interface.
Executors of GitHub actions are called runners: virtual machines (hosted by GitHub) with the GitHub Actions runner application installed.
Note: the GitHub Actions application is open source and can be installed locally, creating “self-hosted runners”. Self-hosted and GitHub-hosted runners can work together.
Upon their creation, runners have a default environment, which depends on their operating system
Several CI systems inherit the “convention over configuration” principle.
For instance, by default (with an empty configuration file) Travis CI builds a Ruby project using rake.
GitHub actions does not adhere to the principle: if left unconfigured, the runner does nothing (it does not even clone the repository locally).
Probable reason: Actions is an all-round repository automation system for GitHub, not just a “plain” CI/CD pipeline
$\Rightarrow$ It can react to many different events, not just changes to the git repository history
Minimal, simplified workflow structure:
# Mandatory workflow name
name: Workflow Name
on: # Events that trigger the workflow
jobs: # Jobs composing the workflow, each one will run on a different runner
  Job-Name: # Every job must be named
    # The type of runner executing the job, usually the OS
    runs-on: runner-name
    steps: # A list of commands, or "actions"
      - # first step
      - # second step
  Another-Job: # This one runs in parallel with Job-Name
    runs-on: '...'
    steps: [ ... ]
We discussed that automation / integration pipelines are part of the software
YAML is often used by CI integrators as preferred configuration language as it enables some form of DRY:
Anchors and aliases (& / *)
Merge keys (<<:)
hey: &ref
  look: at
  me: [ "I'm", 'dancing' ]
merged:
  foo: *ref
  <<: *ref
  look: to
Same as:
hey: { look: at, me: [ "I'm", 'dancing' ] }
merged: { foo: { look: at, me: [ "I'm", 'dancing' ] }, look: to, me: [ "I'm", 'dancing' ] }
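The same semantics can be mimicked with plain Python dictionaries, which may help in reading the expansion above: an alias reuses the anchored value, and a merge key copies its entries unless a key is given explicitly. This is only an analogy built by hand, no YAML parser is involved.

```python
# The anchored mapping (hey: &ref ...)
ref = {"look": "at", "me": ["I'm", "dancing"]}

document = {
    "hey": ref,
    "merged": {
        **ref,         # <<: *ref copies the entries of the anchored mapping...
        "foo": ref,    # foo: *ref reuses the anchored mapping as a value
        "look": "to",  # ...but explicitly written keys take precedence
    },
}

print(document["merged"])
```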
GHA’s YAML parser does not support standard YAML anchors and merge keys
(it is a well-known limit with an issue report open since ages)
GHA achieves reuse via:
Many actions are provided by GitHub directly, and many are developed by the community.
# This is a basic workflow to help you get started with Actions
name: Example workflow
# Controls when the workflow will run
on:
  push:
    tags: '*'
    branches-ignore: # Pushes on these branches won't start a build
      - 'autodelivery**'
      - 'bump-**'
      - 'renovate/**'
    paths-ignore: # Pushes that change only these files won't start the workflow
      - 'README.md'
      - 'CHANGELOG.md'
      - 'LICENSE'
  pull_request:
    branches: # Only pull requests based on these branches will start the workflow
      - master
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # First job of this workflow, called "Default-Example"
  Default-Example:
    # The type of runner that the job will run on
    runs-on: macos-latest
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@d632683dd7b4114ad314bca15554477dd762a938
      # Runs a single command using the runners shell
      - name: Run a one-line script
        run: echo Hello from a ${{ runner.os }} machine!
      # Runs a set of commands using the runners shell
      - name: Run a multi-line script
        run: |
          echo Add other actions to build,
          echo test, and deploy your project.
  Explore-GitHub-Actions:
    runs-on: ubuntu-latest
    steps:
      - run: echo "🎉 The job was automatically triggered by a ${{ github.event_name }} event."
      - run: echo "🐧 This job is now running on a ${{ runner.os }} server hosted by GitHub!"
      - run: echo "🔎 The name of your branch is ${{ github.ref }} and your repository is ${{ github.repository }}."
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "💡 The ${{ github.repository }} repository has been cloned to the runner."
      - run: echo "🖥️ The workflow is now ready to test your code on the runner."
      - name: List files in the repository
        run: ls ${{ github.workspace }}
      - run: echo "🍏 This job's status is ${{ job.status }}."
      # Steps can be executed conditionally
      - name: Skipped conditional step
        if: runner.os == 'Windows'
        # Quoted, as the apostrophe would otherwise open an unterminated shell string
        run: echo "this step won't run, it has been excluded!"
      - run: |
          echo This is
          echo a multi-line
          echo script.
  Conclusion:
    runs-on: windows-latest
    # Jobs may require other jobs
    needs: [ Default-Example, Explore-GitHub-Actions ]
    # Typically, steps that follow failed steps won't execute.
    # However, this behavior can be changed by using the built-in function "always()"
    if: always()
    steps:
      - name: Run something on powershell
        run: echo By default, ${{ runner.os }} runners execute with powershell
      - name: Run something on bash
        shell: bash
        run: echo However, it is allowed to force the shell type and there is a bash available for ${{ runner.os }} too.
GitHub Actions allows expressions to be included in the workflow file
Syntax: ${{ <expression> }}
if: conditionals are automatically evaluated as expressions, so ${{ }} is unnecessary
if: <expression> works just fine
The language is rather limited, and documented at
https://docs.github.com/en/actions/learn-github-actions/expressions
The language performs a loose equality
When a string is required, any type is coerced to string
Type | Literal | Number coercion | String coercion |
---|---|---|---|
Null | null | 0 | '' |
Boolean | true or false | true: 1, false: 0 | 'true' or 'false' |
String | '...' (mandatorily single quoted) | Javascript’s parseInt, with the exception that '' is 0 | none |
JSON Array | unavailable | NaN | error |
JSON Object | unavailable | NaN | error |
Arrays and objects exist and can be manipulated, but cannot be created
Grouping with ( )
Array access with [ ]
Object dereference with .
Logic operators: not !, and &&, or ||
Comparison operators: ==, !=, <, <=, >, >=
Functions cannot be defined. Some are built-in, their expressivity is limited. They are documented at
https://docs.github.com/en/actions/learn-github-actions/expressions#functions
success(): true if none of the previous steps failed
By default, every step has an implicit if: success() conditional
always(): always true, causes the step evaluation even if previous failed, but supports combinations
always() && <expression returning false> evaluates the expression and does not run the step
cancelled(): true if the workflow execution has been canceled
failure(): true if a previous step of any previous job has failed
The expression can refer to some objects provided by the context. They are documented at
https://docs.github.com/en/actions/learn-github-actions/contexts
Some of the most useful are the following
github: information on the workflow context
  .event_name: the event that triggered the workflow
  .repository: repository name
  .ref: branch or tag that triggered the workflow
    refs/heads/<branch>
    refs/tags/<tag>
env: access to the environment variables
steps: access to previous step information
  .<step id>.outputs.<output name>: information exchange between steps
runner:
  .os: the operating system
secrets: access to secret variables (in a moment…)
matrix: access to the build matrix variables (in a moment…)
By default, GitHub actions’ runners do not check out the repository
It is a common and non-trivial operation (the checked out version must be the version originating the workflow), thus GitHub provides an action:
- name: Check out repository code
  uses: actions/checkout@v4
Since actions typically do not need the entire history of the project, by default the action checks out only the commit that originated the workflow (--depth=1 when cloning)
Also, tags don’t get checked out
- name: Checkout with default token
  uses: actions/checkout@v4.2.0
  if: inputs.token == ''
  with:
    fetch-depth: 0
    submodules: recursive
- name: Fetch tags
  shell: bash
  run: git fetch --tags -f
(code from a custom action, ignore the if)
Communication with the runner happens via workflow commands
The simplest way to send commands is to print on standard output a message in the form:
::workflow-command parameter1={data},parameter2={data}::{command value}
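In particular, actions used to set outputs by printing:
::set-output name={name}::{value}
This command is now deprecated: outputs are set by appending name=value lines to the file whose path is stored in the GITHUB_OUTPUT environment variable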
jobs:
  Build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: danysk/action-checkout@0.2.20
      - id: branch-name # Custom id
        uses: tj-actions/branch-names@v8
      - id: output-from-shell
        run: ruby -e 'puts "dice=#{rand(1..6)}"' >> $GITHUB_OUTPUT
      - run: |
          echo "The dice roll resulted in number ${{ steps.output-from-shell.outputs.dice }}"
          if ${{ steps.branch-name.outputs.is_tag }} ; then
            echo "This is tag ${{ steps.branch-name.outputs.tag }}"
          else
            echo "This is branch ${{ steps.branch-name.outputs.current_branch }}"
            echo "Is this branch the default one? ${{ steps.branch-name.outputs.is_default }}"
          fi
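The dice step above writes a name=value line into the file referenced by GITHUB_OUTPUT. A minimal Python sketch of the same mechanism follows; `set_output` is a hypothetical helper name, the fallback to a temporary file is only an assumption to let the script run outside a runner, and multi-line values (which need a different syntax) are not handled.

```python
import os
import random
import tempfile


def set_output(name: str, value: str) -> None:
    """Append 'name=value' to the file pointed to by GITHUB_OUTPUT."""
    output_path = os.environ["GITHUB_OUTPUT"]
    with open(output_path, "a", encoding="utf-8") as output_file:
        output_file.write(f"{name}={value}\n")


if __name__ == "__main__":
    # Outside of a runner GITHUB_OUTPUT is not defined: use a temporary file instead
    if "GITHUB_OUTPUT" not in os.environ:
        handle, path = tempfile.mkstemp(suffix=".env")
        os.close(handle)
        os.environ["GITHUB_OUTPUT"] = path
    set_output("dice", str(random.randint(1, 6)))
```

In a step with id output-from-shell, the value would then be readable by later steps as steps.output-from-shell.outputs.dice.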
Most software products are meant to be portable
A good continuous integration pipeline should test all the supported combinations
The solution is the adoption of a build matrix
if conditionals
jobs:
  Build:
    strategy:
      matrix:
        os: [windows, macos, ubuntu]
        jvm_version: [8, 11, 15, 16] # Arbitrarily-made and arbitrarily-valued variables
        ruby_version: ['2.7', '3.0'] # Quoted: an unquoted 3.0 is a YAML number and loses the trailing zero
        python_version: ['3.7', '3.9.12']
    runs-on: ${{ matrix.os }}-latest ## The string is computed interpolating a variable value
    steps:
      - uses: actions/setup-java@v4
        with:
          distribution: 'adopt'
          java-version: ${{ matrix.jvm_version }} # "${{ }}" contents are interpreted by the github actions runner
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python_version }}
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: ${{ matrix.ruby_version }}
      - shell: bash
        run: java -version
      - shell: bash
        run: ruby --version
      - shell: bash
        run: python --version
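To get a feeling for how many jobs such a matrix produces, the expansion can be sketched as a cartesian product. The snippet below is an illustration of the idea, not of the runner's actual implementation; `expand` is a hypothetical helper and the version values are written as strings.

```python
from itertools import product

matrix = {
    "os": ["windows", "macos", "ubuntu"],
    "jvm_version": [8, 11, 15, 16],
    "ruby_version": ["2.7", "3.0"],
    "python_version": ["3.7", "3.9.12"],
}


def expand(variables: dict) -> list:
    """Return one dict per job: every combination of the variable values."""
    names = list(variables)
    return [dict(zip(names, values)) for values in product(*variables.values())]


jobs = expand(matrix)
print(len(jobs))  # 3 * 4 * 2 * 2 combinations
for job in jobs[:3]:
    # Same interpolation as 'runs-on: ${{ matrix.os }}-latest'
    print(f"{job['os']}-latest", job)
```

Every added variable multiplies the number of jobs, so matrices are best kept to the combinations that are actually supported.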
We would like the CI to be able to
Both operations require private information to be shared
Of course, private data can’t be shared
printenv)
How to share a secret with the build environment?
Secrets can be stored in GitHub at the repository or organization level.
GitHub Actions can access these secrets through the secrets.<secret name> context object
Secrets can be added from the web interface (for mice lovers), or via the GitHub API.
#!/usr/bin/env ruby
# Usage: <script> <owner/repository> <secret name> <secret value>
require 'rubygems'
require 'bundler/setup'
require 'base64' # provides Base64, used below
require 'octokit'
require 'rbnacl'
repo_slug, name, value = ARGV
client = Octokit::Client.new(:access_token => 'access_token_from_github')
# Secrets must be encrypted with the repository public key before being uploaded
pubkey = client.get_public_key(repo_slug)
key = Base64.decode64(pubkey.key)
sodium_box = RbNaCl::Boxes::Sealed.from_public_key(key)
encrypted_value = Base64.strict_encode64(sodium_box.encrypt(value))
payload = { 'key_id' => pubkey.key_id, 'encrypted_value' => encrypted_value }
client.create_or_update_secret(repo_slug, name, payload)
Containers can be thought of as (but they are not) lightweight virtual machines:
Docker is the most common containerization technology (the de facto standard).
We will interact with Docker through its Command Line Interface
docker [container|image] <command> <args>
[container|image]: optional target
<command>: the action to perform
<args>: the command’s arguments
docker image
ls: lists the local images
prune: removes the dangling (untagged and unused) images
pull: downloads an image from a registry
rm: removes the given images
docker [container] run <options> image
Creates and executes a new container from image
(“container” can be omitted)
-p <host>:<guest>: publishes (exposes) the container’s port <guest> to host’s port <host>
-v <host>:<guest>: bind mount, mounts absolute path <host> into the container at absolute path <guest>
--name <name>: assigns a unique name to the container
-i: interactive mode. Required to send commands to the container
-t: attaches a pseudo-TTY. Use the option to have the same feel of a terminal open in the container
-e <key>=<value>: sets environment variable <key> to <value>
-d: detached, returns control to the host terminal and runs in background
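To see how these options combine on a single command line, here is a small Python sketch that only assembles the argument list and prints it, without launching Docker. The helper `docker_run_command`, the image name and all the values are hypothetical and chosen purely for illustration; note that the name option is spelled with two dashes (--name).

```python
import shlex


def docker_run_command(image, *, name=None, ports=None, volumes=None,
                       env=None, interactive=False, tty=False, detached=False):
    """Return the argument list of a 'docker run' invocation (nothing is executed)."""
    command = ["docker", "run"]
    if name:
        command += ["--name", name]
    for host_port, guest_port in (ports or {}).items():
        command += ["-p", f"{host_port}:{guest_port}"]
    for host_path, guest_path in (volumes or {}).items():
        command += ["-v", f"{host_path}:{guest_path}"]
    for key, value in (env or {}).items():
        command += ["-e", f"{key}={value}"]
    if interactive:
        command.append("-i")
    if tty:
        command.append("-t")
    if detached:
        command.append("-d")
    command.append(image)  # the image always comes after the options
    return command


command = docker_run_command(
    "nginx:latest",  # hypothetical image, used only as an example
    name="web",
    ports={8080: 80},
    volumes={"/srv/site": "/usr/share/nginx/html"},
    env={"TZ": "UTC"},
    detached=True,
)
print(shlex.join(command))
```

The resulting list could be handed to subprocess.run on a machine where Docker is installed; here it is only printed.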
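docker container
ls --all: lists all containers, including the stopped ones
exec <name> <command>: runs <command> in a running container named <name>
  shares some options with run: -i, -t, -e, -d
top: shows the processes running inside a container
rename <old> <new>: renames container <old> to <new>
rm <names>: removes the named containers
prune: removes all stopped containers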
Images are defined in Dockerfiles
# Starting point. Use scratch to start from an empty image (not recommended)
FROM imagename:version
# Run a command in the current image. Side effects will be stored
RUN some command
# Copy a file into the image
COPY file/in/host /destination/in/image
# Set an environment variable
ENV MY_VARIABLE=MYVALUE
# Change directory
WORKDIR /my/new/directory
# Configure the container to run as an executable
ENTRYPOINT ["executable", "parameter", "parameter2"]
# Default command (when ENTRYPOINT is also set, these become its default arguments)
CMD ["executable", "parameter", "parameter2"]
Tagging is the operation of adding custom symbolic names (aliases) to images
docker [image] tag <source_image> <target_image>
Creates an alias for <source_image> named <target_image>
(“image” can be omitted)
Tags are usually structured as name:version
name is typically in the form owner_name/image_name
owner_name/ can be absent, as in the python images below
version can be any string, but:
latest is a special version that, by convention, identifies the most recent image
it often encodes the version of the packaged software (e.g., 3.10)
python:3.10 contains the Python interpreter at version 3.10
python:3.10-buster contains the Python interpreter at version 3.10 and the runtime of Debian Buster
docker [image] build <options> <directory>
Creates a new image from a directory containing a Dockerfile
(“image” can be omitted)
From the directory where the Dockerfile is located: docker build .
-t <tag>: adds tag <tag> to the image. Multiple tags can be specified.
docker build -t my_name/my_project:latest -t my_name/my_project:1.0.0 .
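The tags used in the build command follow the owner_name/image_name:version structure. As a rough sketch of how such a reference decomposes, the Python snippet below splits it into its parts; `ImageReference` and `parse_image_reference` are hypothetical names, a missing version is assumed to mean latest, digests are not handled, and anything before the last slash (including a registry host, if present) ends up in the owner field.

```python
from typing import NamedTuple, Optional


class ImageReference(NamedTuple):
    owner: Optional[str]
    name: str
    version: str


def parse_image_reference(reference: str) -> ImageReference:
    """Split 'owner_name/image_name:version' into its parts."""
    repository, separator, version = reference.rpartition(":")
    if not separator or "/" in version:
        # No version given (a colon followed by a slash belongs to a host:port prefix)
        repository, version = reference, "latest"
    owner, _, name = repository.rpartition("/")
    return ImageReference(owner or None, name, version)


for example in ("my_name/my_project:1.0.0", "python:3.10-buster", "python"):
    print(example, "->", parse_image_reference(example))
```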
Images are fetched and stored in registries
docker.io (Docker Hub)
ghcr.io
docker login <registry>
If <registry> is omitted, Docker Hub (docker.io) is used
docker login <registry>
docker login -u <username> -p <password> <registry>
cat <password_file> | docker login -u <username> --password-stdin <registry>
docker push <image>
Pushes <image> to the registry (the user must be logged in)
If the user is <user>, then the image must be tagged as <user>/<name>:<version>
Your image can now be pulled by anyone!
Artifacts that are not archived are at a high risk of being lost forever!
With them, away goes the possibility of reproducing the results.
Zenodo is a service for archiving and sharing scientific artifacts. Its key features are:
To automatically archive a repository on Zenodo:
A legal instrument used to regulate access, use, and redistribution of software
Legal right that grants the creator of an original work exclusive rights for its use and (re)distribution
Practice (not a legal right!) in which the creator surrenders some, but not all, rights under copyright law.
Possession of a copy of software.
The possession implies right to use, even if such use implies a violation of the license (e.g. for making changes to the software, or making incidental copies).
The software is not sold, but merely “licensed”, namely permitted to be used, under the conditions of an End-user license agreement (EULA).
The software publisher grants the right to use a certain number of copies under the conditions of an EULA, but does not transfer ownership of the copies to the customer. Usage of the software may be subject to acceptance of the EULA.
The software publisher grants extensive rights to modify and redistribute the software, often prohibiting rolling back such rights (strong copyleft).
Much easier for Italian speakers:
Free of charge
The user receives the source code of the software, is allowed to modify and redistribute it.
Usually together, but:
There are non-free open source licenses:
And there are free non-open source licenses as well
GNU General Public License
GNU Lesser General Public License (LGPL)
GNU General Public License with linking exception
Example exception
Linking this library statically or dynamically with other modules is making a combined work based on this library.
Thus, the terms and conditions of the GNU General Public License cover the whole combination.
As a special exception, the copyright holders of this library give you permission to link this library with
independent modules to produce an executable, regardless of the license terms of these independent modules, and to
copy and distribute the resulting executable under terms of your choice, provided that you also meet, for each linked
independent module, the terms and conditions of the license of that module. An independent module is a module which
is not derived from or based on this library. If you modify this library, you may extend this exception to your
version of the library, but you are not obliged to do so. If you do not wish to do so, delete this exception
statement from your version.
If a NOTICE file with authors is provided, it must be included when redistributed. Entries can be appended.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
Available rights
BY (Attribution) – Derivative works must credit the original author
SA (Share-alike) – Enables copyleft
NC (Non-commercial) – Derivative work can only be used for non commercial purposes
ND (No derivative Works) – Free distribution and copy, but derivatives are forbidden
Valid combinations
CC0 – Public domain (prefer the MIT license for a similar protection)
CC-BY – Attribution
CC-BY-SA – Attribution, Share-alike (enables copyleft)
CC-BY-NC – Attribution Noncommercial
CC-BY-NC-SA – As above, plus copyleft
CC-BY-ND – Attribution Noderivatives (commercially usable, but not modifiable)
CC-BY-NC-ND – As above, non commercial
Add a LICENSE or COPYING plain text file in the repository with the full license text
Your software is now licensed!
Documentation of a project is part of the project
Two possibilities:
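A dedicated orphan branch, created with git checkout --orphan <branchname>
A docs/ folder in the root of some branch (master or any other branch)
Once done, enable GitHub pages on the repository settings:
The site of a repository is published at https://<username>.github.io/<reponame>
https://<username>.github.io/ is published from the repository named <username>.github.io
https://<organization>.github.io/ is published from the repository named <organization>.github.io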
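The naming rules above can be summarized in a tiny Python sketch. `pages_url` is a hypothetical helper, the owner and repository names in the example are made up, and custom domains are deliberately ignored.

```python
def pages_url(owner: str, repository: str) -> str:
    """Return the default GitHub Pages address of a user or organization repository."""
    if repository.lower() == f"{owner}.github.io".lower():
        # The special <owner>.github.io repository is served at the root
        return f"https://{owner}.github.io/"
    return f"https://{owner}.github.io/{repository}"


print(pages_url("alice", "thesis-experiments"))
print(pages_url("alice", "alice.github.io"))
```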