Ongoing list of PhD resources I like

I am a big sucker for productivity and love finding tools to get things done better and quicker. So when I feel like taking a break from my research work or straight out want to procrastinate, I often find myself browsing the web looking for tools to improve my writing, R code, or some advice that helps me with my research life. And there is a lot out there! Here’s an ongoing list of resources I like. Because why reinvent the wheel if someone else has already done the work before you?

Raul Pacheco-Vega’s blog

This is currently my favourite blog. Raul Pacheco-Vega is a Mexico-based assistant professor of political science, doing some incredible work sharing his advice on all things academia in many series of blog posts. What I think is particularly great is that he makes a lot of the ‘implied, gut-feeling-type’ knowledge about the research process explicit. Whether it is paper writing, organising your day or work-life balance, there’s a fair chance he’s got you covered.

The Thesis Whisperer

The blog is aimed at PhD students, and discusses all kinds of issues you can come across in the PhD journey: feeling stuck with your writing or your project, doing a good literature review, dealing with feedback and much more. The blog is run by Inger Mewburn, an assistant professor and research training director based in Australia. What I really like about this blog is that Inger Mewburn provides a platform for other (ex-)graduate students to share their experiences. It gives a very ‘real’ insight into PhD life.

How to write a better minor thesis

I read this book when I wrote my master’s thesis, and it was just super helpful, so I think everyone should know about it. It helped me shape my arguments and build the structure of a long text. I haven’t used the major thesis version of this book yet, but definitely plan to.

Our World in Data

Because data is awesome, and it’s good to be informed about the world around you. Website by Max Roser and team.

Fundamentals of Data Visualization

Which plot to make and how to make it readable? So many good ideas on this page! And upon further inspection it turns out that Claus Wilke, the author of this blog and professor of integrative biology in the US, has written many blog posts with questions you may find yourself asking as a researcher. Recommended!

Grasp that topic! Getting started with a new area of research

When I started my PhD – about three months ago – I found myself having to learn the ins and outs of a whole new topic. Really fun, as I love learning new things, but also kind of like being shipwrecked and floating in the middle of the ocean with no idea which way to swim or where the nearest land might be. Help!

Introductory books or guides are often a good way to get started with a new topic, but there’s one small issue with that: it requires actually having that introductory book or guide (or at least knowing what and where it is). So how to find your way in the vast sea of information? Here’s 5 things that worked for me:

1. Good old Wikipedia
Even though academia did a thorough job at training me to trust only peer-reviewed literature (and even then be skeptical), I don’t find reading peer-reviewed articles particularly helpful when I am new to a field. It always kind of feels like trying to read the blueprint for a coffee machine, when all I really want is the quick start guide. Just show me where the on-button is so I can make my coffee! Wikipedia is like that quick start guide. Just type in your general area of research and poof, there you go.

2. Twitter
#sciencetwitter is a thing and used by scientists all over the world. It’s a great resource to find out about things that are happening right now in your field. Allow me to demonstrate. Here are 4 tweets that show up at the top of the Twitter results page if I just type in ‘mangrove’:

Result: a new, very relevant paper, a video that shows how mangroves protect us (though I find it rather puzzling that it’s posted by an account named Destroying Stuff), someone speaking about mangroves at a conference, and a passionate post about mangroves in Kenya. Pretty cool, right? Don’t think I would have found these right away with my standard literature search.

3. YouTube
You can read as much as you want, a picture tells a thousand words and a video even more. I found YouTube a great resource to understand the new ecosystem I am studying. The videos of the different aspects of the mangrove forest were illustrative in a way that a scientific paper could never be.

Not bad for an intro to mangrove ecology

I am not sure if YouTube can work as well for non-ecology topics, but in any case, give it a try. You might stumble upon some useful lectures.

4. Google Scholar: download all the things!
I’m the kind of person who prefers to have a broad overview before I dive in, and I love collecting bits and pieces of new information. Two days into my first literature search (with basic search terms like mangrove ecology and coastal protection) I had downloaded over 200 papers, books and book chapters. This may seem a little excessive, but I was quite happy with my fresh stack of PDFs. Here’s what I did with them:

  • I loaded all the papers into my reference manager (Mendeley), which automatically renamed and sorted them for me. Then, I had a look at their title, journal and, if really curious, the abstract. This helped me to paint a mental image of the various topics in my research area, and also helped me identify materials that I wanted to start reading more thoroughly.
  • Sorting the papers by author gave me some idea of the key players in the field, and sorting by journal pointed me to the journals where I should keep an eye out for new publications (tip: sign up to their mailing list).
  • Finally, I sorted the papers by year, so I had a rough idea of the way the field has changed over the past decades and when the current hot topics started emerging. I continue to repeat these steps as I collect more articles to refine my data on major players, big journals and important years in the field.
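
If you would rather count than eyeball, the three sorting steps above can be roughed out in a few lines of Python. This is only a minimal sketch: the paper records are made up for illustration (they are not a real Mendeley export), and the tally helper is a hypothetical name of my own.

```python
from collections import Counter

# Made-up records standing in for whatever your reference manager exports.
papers = [
    {"author": "Author A", "journal": "Journal X", "year": 2008},
    {"author": "Author B", "journal": "Journal Y", "year": 2013},
    {"author": "Author A", "journal": "Journal X", "year": 2014},
    {"author": "Author C", "journal": "Journal X", "year": 2014},
]


def tally(records, field):
    """Count how often each value of `field` occurs, most frequent first."""
    return Counter(record[field] for record in records).most_common()


# Key players and key journals: whoever shows up most often.
by_author = tally(papers, "author")
by_journal = tally(papers, "journal")

# Rough timeline: number of papers per decade.
by_decade = Counter((paper["year"] // 10) * 10 for paper in papers)

for name, count in by_author:
    print(f"{name}: {count} paper(s)")
```

Rerunning something like this as the collection grows is one way to keep the picture of major players, big journals and important years up to date.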

5. Mindmap it
This tool is not as useful on day 1 as the tips above, because you kind of need to know your area of research a tiny, tiny bit (I did this three weeks in) but I think it is still worth mentioning. I love mind mapping, because it forces me to think about the stuff I read and I often end up making connections (not always sensible ones, but still, it’s good to get creative) I wouldn’t have thought of if I just left all the info floating around in my brain.

Mindmapping in PowerPoint

What I did: based on all the stuff I had read, I came up with a new list of keywords (about 40 words or so). I fired up PowerPoint, wrote each of the words in a text box and then started drawing and redrawing arrows all over the place. Quite fun, and a nice way to take a break from the endless early-PhD reading sessions for a while.

I’d love to hear what works for you. What do you do when starting out with a new topic? What do you think of these tools?

Tutorial: Running RStudio in the NeCTAR cloud

One of my models is horribly slow and was going to take about two weeks to run. Although two weeks isn’t really that bad, I don’t like waiting… So I decided to try NeCTAR. Running my models on a Virtual Machine sounded like an excellent solution. Just one little issue: I had absolutely no idea where to get started.

If you are like me (not familiar with virtual machines, cloud computing and command line interfaces), trying to use NeCTAR can be quite off-putting. Especially if you don’t want to spend three days figuring out how it works. But lucky you, I did.

Running RStudio in the cloud for total beginners

1. Logging in to your Dashboard.

  • Go here: https://dashboard.rc.nectar.org.au
  • Log in with your AAF credentials (your uni login). If you don’t have these, it means you will never ever ever be able to get in. Stop reading this and go to Amazon Web Services, they will give you a year of cloud computing services for free.
  • Once logged in, you will see your dashboard. Just click around for a bit to see where everything is, you don’t have to know what any of these things mean yet.
    Note: make sure you are always working in the right project. If this is your first time in the NeCTAR Dashboard, don’t worry about this. You should only have one project, which is a trial project that runs for about 3 months.
Your NeCTAR Dashboard will look similar to this

2. Creating a Key Pair

A key pair is an added safety feature you will need to start an instance (an instance is where your virtual machine will be running) and get access to your instance. A key pair has two parts: a public part and a private part. The public part is like the lock on your door: public to everyone, but only the person with the key (the private part), can get access to what’s behind it. Just like with physical keys, don’t lose your private key.

  • On your Dashboard, go to: Project > Compute > Access & Security. At the top left (sort of) of the screen, choose Key Pairs. Your screen should now show something like this:

    Key Pair page. I have one key pair, but you might have none.

  • Click on Create Key Pair > choose a name > Create Key Pair. A window will pop up and prompt you to download your <yourkeypairname>.pem. Download it and don’t lose it. This is your private key.
  • That’s all. The public key will be stored somewhere on NeCTAR, so you don’t have to worry about that.

3. Make a security group

Security groups are the security guards of your virtual machine: they decide who gets in and who doesn’t, and how. We are going to make our own security group.
Note: NeCTAR is very careful with security, so please read this as well: https://support.rc.nectar.org.au/docs/security-guidelines.

  • Go to: Project > Compute > Access & Security. At the top of the screen, choose Security Groups. Your screen should now show something like this:

    Security Group page

  • Click on Create Security Group. Choose a name and description and click on Create Security Group. It doesn’t matter what you fill in at this point, you can always go back and change or delete it.
  • Your new group should now be added to the list. Next, we are going to add some rules. These rules will explain to our virtual security guards who gets in (which IPs) and how (through which ports).
  • Click on Manage Rules > Add rule (in the next screen at top right).

    Add rule

  • Fill out all the fields as in the picture above and click Add. Port 22 stands for SSH access (instead of ‘Custom TCP Rule’ you can also choose ‘SSH’ in the top field). CIDR states which computer (IP) addresses get access. 0.0.0.0/0 means that everyone can get in.
  • Add two more rules. Keep all the fields the same as in the picture above, but for the second rule fill out Port 80, which stands for HTTP. For the third rule, use Port 8787, the default access port of the RStudio server.
  • You have now told the Security Group that it should only allow access to the virtual machine through ports 22, 80 and 8787, but every IP address can get in.
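
To get a feel for what the CIDR field does, here is a small Python sketch that mimics the logic of the three rules. It is a toy model of the idea, not how NeCTAR actually enforces its security groups. The is_allowed function and the example IP addresses (taken from ranges reserved for documentation) are my own assumptions.

```python
import ipaddress

# Ports opened in this step: SSH, HTTP and the RStudio server default.
ALLOWED_PORTS = {22, 80, 8787}


def is_allowed(source_ip, port, cidr="0.0.0.0/0"):
    """Return True if a connection from source_ip to port would pass the rule."""
    network = ipaddress.ip_network(cidr)
    return port in ALLOWED_PORTS and ipaddress.ip_address(source_ip) in network


# 0.0.0.0/0 covers every IPv4 address, so any computer gets in on an open port.
open_rule = is_allowed("203.0.113.7", 22)

# A port without a rule stays closed, whatever the address.
closed_port = is_allowed("203.0.113.7", 3306)

# A narrower CIDR such as 192.0.2.0/24 only lets in addresses from that range.
narrow_hit = is_allowed("192.0.2.10", 8787, cidr="192.0.2.0/24")
narrow_miss = is_allowed("203.0.113.7", 8787, cidr="192.0.2.0/24")

print(open_rule, closed_port, narrow_hit, narrow_miss)
```

If you always connect from the same network, replacing 0.0.0.0/0 with a narrower range is the stricter choice.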

4. Create an instance

  • Your virtual machine will be running on what is called an instance. To launch an instance go to Project > Compute > Instances. At the top right of the screen, choose Launch Instance. This window will come up:

    Launch Instance. I already have one instance running.

  • In the Details tab, choose a name for your instance. You will also have to choose a flavour. I recommend choosing m1.small for now, as that will take up half the allocated space in your trial project. Under image name, choose NeCTAR Ubuntu Trusty (or just do what it shows in the screenshot above).
  • Under Access & Security, choose the Key Pair you made in step 2 and select the Security Group you made in step 3.
  • Click Launch. Now your virtual machine will start loading. Once the Power State says Running, you can move on to the next step.

5. Accessing your instance

Now this step is a little bit trickier, because it depends on what computer you are currently using. Here, I’m going to explain how to do this using a Windows computer. If you are running a Mac (or Linux or something else Unixy), it’s actually more straightforward: just open a terminal and type this:

ssh -i yourkeypairname.pem ubuntu@XX.XX.XX.XX

(you can find your IP address on the Instances page) and wait for the Windows users to catch up in step 6.

  • Windows users: Download PuTTY and PuTTYgen here. PuTTY is an SSH client for Windows, you will need this to get access to your instance. You are going to use PuTTYgen to translate the <yourkeypairname>.pem file into something Windows understands. Silly Windows.
  • Open PuTTYgen.

    PuTTYgen

  • Click Load, and load the <yourkeypairname>.pem file you downloaded earlier. Now click Save private key. Give the new file the same name as your old .pem file (but the extension should be .ppk on your new file).
  • Close PuTTYgen and open PuTTY. We are now ready to SSH our way into the instance. PuTTY should open up in the Session screen.

    PuTTY’s Session screen

  • Under Host Name, fill out: ubuntu@XX.XX.XX.XX. You can find your instance’s IP address on the Instances page; put that in place of the XX.XX.XX.XX.
  • Make sure port 22 is selected and that connection type SSH is selected. Don’t worry about the rest.
  • Next go to Connection > SSH > Auth:

    PuTTY

  • In the Auth (authentication) screen, choose Browse. Open the <yourkeypairname>.ppk file we created earlier using PuTTYgen. Once you have done that, click Open in the PuTTY screen. We are now SSHing our way into the instance. Yay!
  • If a warning pops up, click Yes (assuming you trust the connection we are making). It’s just doing that because it has never made this connection before.

    PuTTY security message

6. First time in the instance

Mac users: you can join again!

  • We have now connected to our instance. Your screen should look something like this:

    The instance

  • If it is asking for a username or something like that, type ‘ubuntu’.
  • We are now in our virtual machine, with Ubuntu as the operating system. We can start telling the operating system what we want it to do, such as running R.

7. Installing R and the RStudio server on Ubuntu Trusty

  • After some web searching I came across this awesome bit of code provided by yhat. Type the following bits of code in the Ubuntu server and it will install R and RStudio:
# adding a username. follow the prompts (no need to fill out your name and address)
sudo adduser USERNAME

# update
sudo apt-get update

# install r
sudo apt-get install r-base

# install rstudio
sudo apt-get install gdebi-core
sudo apt-get install libapparmor1

# check rstudio.com for the newest version
wget http://download2.rstudio.org/rstudio-server-0.98.1103-amd64.deb
sudo gdebi rstudio-server-0.98.1103-amd64.deb

# make sure everything is working
sudo rstudio-server verify-installation
  • Now open your browser and go to: XX.XX.XX.XX:8787. It will prompt you for the username and password you just created with ‘sudo adduser’. Fill those out and you’re in!

Remember to shut off your instance when you are finished so you don’t lose valuable time on your free NeCTAR project trial (Dashboard > Project > Compute > Instances > Actions > Shut off Instance).

ResBaz research tools

ResBaz, or Research Bazaar, was hosted this year for the first time at the University of Melbourne and I got to attend this awesome conference (so did a couple of other QAECOlogists, check Saras Windecker’s blog post for an impression). ResBaz is all about open science, sharing, collaborating and getting the message out there. To make this happen, the attending researchers were trained in digital tools such as R, MATLAB, Python, mapping software and version control.

I only attended two streams, R and SQL (databases), but because of all the mingling with fellow researchers under the big bazaar tent (as well as twittering, lots of twittering!), I also got a good impression of all the other great stuff that’s out there. I have to say, it’s good to know what all the options are when doing your research. If you missed the conference, here’s a list of all the cool research tools you really want to know about:

Version control and other important things

Unix Shell
What is it?
The cornerstone of all programming.

Why do you want to use it?
It makes life easier. Once you get over your fear of accidentally removing important files (we got warned about this a couple of times), it is incredibly useful for automating repetitive tasks and linking existing programs together.

 

GitHub
What is it?
A web-based hosting service for Git, a version control tool for software.

Why do you want to use it?
Because it makes it very difficult to lose stuff, it’s like you’ve got an undo button. It also makes collaborations much easier. Not convinced? Here is a post explaining exactly why you need it: https://drclimate.wordpress.com/2012/11/16/version-control/.

 

NeCTAR
What is it?
An online research cloud that is free for all Australian researchers.

Why do you want to use it?
It allows you to run your analyses and simulations online on a remote computer (save time!) among other very good reasons. Damien Irving explains it really well in this post.

 

Data analysis tools

R
What is it?
Programming language for statistical analysis and making pretty graphs.

Why do you want to use it?
It is free (yay!), it is used a lot (which means there is lots of help and support when R doesn’t do what you want it to do) and it is very powerful.

 

Python
What is it?
A snake... no, a programming language used for many different applications.

Why do you want to use it?
It’s a well-written, simple language with an English-like syntax and there are lots of specialist libraries available.

I personally also like IPython Notebook because it lets you share your analyses and code with others: http://ipython.org/notebook.html

 

MATLAB
What is it?
A high-level technical computing language. Its basic data element is the matrix.

Why do you want to use it?
Because it is really really good at data processing and also good at graphical output. It’s not so good at being a good programming language.

 

Mapping tools

For a detailed review of all the mapping tools at ResBaz check out Lizzy Lowe’s blog.

CartoDB
What is it?
A cloud-based mapping tool.

Why do you want to use it?
Because it is a simple yet powerful tool that lets you visualise data onto maps.

 

TileMill
What is it?
A mapping tool to quickly and easily design maps for the web using custom data.

Why do you want to use it?
It is a more advanced tool than CartoDB and has many more options for map customisation, and also because of this:

 

AURIN
What is it?
It contains over 1300 Australian urban datasets on health, socio-economics, demographics and the built environment.

Why do you want to use it?
It can be used to combine and analyse datasets to reveal patterns in the urban environment.

 

NLTK, Authorea and SQL

Natural Language Toolkit for Python
What is it?
A platform to work with human language data / a text mining tool.

Why do you want to use it?
To comment on features of the language in a large corpus of text, and because:

 

Authorea
What is it?
A new, collaborative writing tool that aims to change the way researchers write papers.

Why do you want to use it?
It makes writing much more interactive and collaborative. It backs up every change that is made, people can work together and everything can be written in LaTeX.

 

Databases & SQL
What is it?
SQL is a language used to communicate with databases.

Why do you want to use it?
Spreadsheets can only do so many calculations before their use becomes limited. Databases pick up where spreadsheets leave off. They are faster, can work with much larger datasets and do lots of stuff spreadsheets can’t do.
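
To give a taste of what ‘communicating with a database’ looks like, here is a tiny sketch using SQLite through Python’s built-in sqlite3 module. The surveys table and its contents are invented for illustration and are not from the ResBaz lessons.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (site TEXT, species TEXT, n INTEGER)")

# Invented example rows: one row per observation.
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [
        ("north", "crab", 4),
        ("north", "snail", 9),
        ("south", "crab", 7),
        ("south", "snail", 2),
        ("south", "crab", 3),
    ],
)

# One SQL statement does the grouping and summing for us.
totals = conn.execute(
    "SELECT species, SUM(n) FROM surveys GROUP BY species ORDER BY species"
).fetchall()
conn.close()

for species, total in totals:
    print(species, total)
```

The same SELECT would work unchanged on a table with millions of rows, which is roughly where a spreadsheet starts to struggle.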

 

Want more?

And that’s not even all of the stuff that was on offer at the bazaar! Check out their website for the full details of all the tools that were available. If you feel like studying straight away, Software Carpentry offers online lessons. Is your code not working and you want a human to help you? If you are at Melbourne Uni, Parkville, you can come to Hacky Hour. This happens every Thursday at 3pm at the Tsubu bar.