Manhattan: ggplot2-based Manhattan plots
Boxiang Liu
2017-12-15

Contents

Introduction
Installation
Usage
    Basic Manhattan plot
    Customizing colors
    Highlight and label SNPs and genes
    Adding GWAS significance line
    Plotting multiple GWAS studies
Details
Questions and Bugs

Introduction

The Manhattan plot is a specialized scatterplot for displaying the results of genome-wide association studies (GWAS). The x-axis is the genomic position, and the y-axis is usually -log10(P-value), although other sensible metrics can be used as well. There are many packages for making Manhattan plots, but most of them are not easily extensible. The package ggplot2 has become increasingly popular in the R and bioinformatics communities, so a package that is fully compatible with ggplot2 and customizable with its geoms makes Manhattan plots easier to use.

manhattan is a ggplot2-based package for making Manhattan plots. It takes a data.frame and returns a standard ggplot object, to which the user can add other geoms. This makes it convenient to build on manhattan and customize plots.

Installation

To install manhattan, use the standard R package installation command.

    # install.packages("manhattan")

If you want the latest development version, install it using devtools.

    devtools::install_github("boxiangliu/manhattan")

The package has been tested on Linux and Mac OS X. It has not been tested on Windows.


Usage

Basic Manhattan plot

To illustrate its usage, let us plot the coronary artery disease GWAS from Deloukas et al. (2013). The original dataset provides nominal p-values. Since we want to plot -log10(P-value), let us take the logarithm.

    library(manhattan)
    data(cad_gwas)
    cad_gwas$y=-log10(cad_gwas$pval)
    head(cad_gwas)

    ##    chrom       pos     pval       rsid         y
    ## 1:  chr1 100098846 0.180432   rs494626 0.7436864
    ## 2:  chr1 100128148 0.030573 rs10747505 1.5146619
    ## 3:  chr1 100183875 0.081842  rs1541044 1.0870238
    ## 4:  chr1 100258576 0.423634   rs531174 0.3730092
    ## 5:  chr1 100351915 0.518362  rs2810422 0.2853668
    ## 6:  chr1 100385263 0.190763   rs499479 0.7195059

The dataset contains five columns: chrom, pos, rsid, pval, and y. Three of them are required:

1. chrom
2. pos
3. y

The chrom and pos columns specify the genomic location (x-axis), and y specifies the y-axis. The column name "y" is intentional: not every Manhattan plot uses -log10(P-value) as the y-axis.

After loading the data, we are ready to make a Manhattan plot. Note that Deloukas et al. use hg18. To get the chromosome lengths right, we specify hg18 as the build argument.

    manhattan(cad_gwas,build="hg18")
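The three required columns listed above can be checked before plotting. The following is a minimal sketch in base R; `check_manhattan_input` is a hypothetical helper written for illustration and is not part of the manhattan package. The toy rows reuse the first three rows of the `head(cad_gwas)` output.

```r
# Hypothetical helper (NOT part of manhattan): verify the required columns.
check_manhattan_input <- function(df, required = c("chrom", "pos", "y")) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data: first three rows shown in the head(cad_gwas) output.
toy <- data.frame(
  chrom = c("chr1", "chr1", "chr1"),
  pos   = c(100098846, 100128148, 100183875),
  pval  = c(0.180432, 0.030573, 0.081842),
  stringsAsFactors = FALSE
)

# Derive the y column as -log10 of the nominal p-value.
toy$y <- -log10(toy$pval)

# Passes silently because chrom, pos, and y are all present.
check_manhattan_input(toy)
```

Calling the helper on a data.frame without a y column stops with an error naming the missing column, which is easier to diagnose than a failure inside the plotting call.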

[Figure: Manhattan plot of cad_gwas. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Ta-da! Our first Manhattan plot.

Customizing colors

If black and grey feel dull, we can change the two chromosome colors.

    manhattan(cad_gwas,build="hg18",color1="skyblue",color2="navyblue")

[Figure: the same Manhattan plot drawn with color1 = skyblue and color2 = navyblue. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Highlight and label SNPs and genes

A common task is to highlight and annotate SNPs of interest. The package manhattan requires colors and SNP labels to be specified in the input data.frame as two columns: color and label. Only SNPs of interest should have color strings; all other SNPs must be left as NA. Let us highlight two SNPs, rs602633 and rs1333045.

    cad_gwas[cad_gwas$rsid=="rs602633","color"]="green"
    cad_gwas[cad_gwas$rsid=="rs1333045","color"]="red"
    manhattan(cad_gwas,build="hg18")
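To make the NA convention concrete, here is a small base R sketch on a toy data.frame (the toy table is an assumption for illustration; only the rsids come from the text above). It initializes the color column explicitly as NA and then fills in the SNPs of interest, so every other row is guaranteed to stay NA.

```r
# Toy table with three SNPs; only two of them should be highlighted.
toy <- data.frame(
  rsid = c("rs494626", "rs602633", "rs1333045"),
  stringsAsFactors = FALSE
)

# Start with NA everywhere, then set colors only for the SNPs of interest.
toy$color <- NA_character_
toy$color[toy$rsid == "rs602633"]  <- "green"
toy$color[toy$rsid == "rs1333045"] <- "red"

print(toy)
```

Initializing the column first avoids relying on how a particular table class fills in a column that does not yet exist.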

[Figure: Manhattan plot with rs602633 highlighted in green and rs1333045 in red. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

We can also label the two SNPs. Note again that only SNPs of interest have label strings; other SNPs should be left as NA. Since manhattan returns a ggplot object, we can simply add a geom_text layer.

    cad_gwas[cad_gwas$rsid=="rs602633","label"]="rs602633"
    cad_gwas[cad_gwas$rsid=="rs1333045","label"]="rs1333045"
    manhattan(cad_gwas,build="hg18")+geom_text(aes(label=label),hjust=-0.1)

    ## Warning: Removed 79126 rows containing missing values (geom_text).

[Figure: Manhattan plot with the two highlighted SNPs labeled rs1333045 and rs602633. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Adding GWAS significance line

Any standard geom can be used with manhattan. For instance, let's add a line indicating the genome-wide significance threshold.

    manhattan(cad_gwas,build="hg18")+geom_text(aes(label=label),hjust=-0.1)+geom_hline(yintercept=-log10(5e…

    ## Warning: Removed 79126 rows containing missing values (geom_text).
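A commonly used genome-wide significance threshold is P = 5e-8, which corresponds to a Bonferroni correction of 0.05 over roughly one million independent tests. The sketch below assumes that convention; the value is an assumption and is not recovered from the truncated call above. It shows how the threshold maps onto the -log10 scale used for the y-axis.

```r
# Assumed convention: Bonferroni correction for ~1e6 independent tests.
alpha   <- 0.05
n_tests <- 1e6
p_threshold <- alpha / n_tests   # 5e-08

# Position of the horizontal line on the -log10(P-value) axis.
y_line <- -log10(p_threshold)
print(round(y_line, 5))
```

The resulting `y_line` value could then be passed as the `yintercept` of a `geom_hline` layer.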

[Figure: the labeled Manhattan plot (rs1333045, rs602633) produced by the call above, with the significance line added. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Plotting multiple GWAS studies

We can go even further and use facets with manhattan. For illustration, let us pretend that we have two GWAS studies by duplicating cad_gwas, and plot the two studies one above the other.

    cad_gwas_2=rbind(cbind(cad_gwas,study="GWAS 1"),cbind(cad_gwas,study="GWAS 2"))
    manhattan(cad_gwas_2,build="hg18")+facet_grid(study~.)
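Faceting relies on the data being in long format: one row per SNP per study, with a column naming the study. The base R sketch below shows the shape of such a table on toy data (the rows reuse values from the `head(cad_gwas)` output; the second study is a plain duplicate, as in the text).

```r
# Toy single-study table.
toy <- data.frame(
  chrom = c("chr1", "chr1"),
  pos   = c(100098846, 100128148),
  y     = c(0.7436864, 1.5146619),
  stringsAsFactors = FALSE
)

# Stack two copies, tagging each with its study name.
stacked <- rbind(
  cbind(toy, study = "GWAS 1"),
  cbind(toy, study = "GWAS 2")
)

# Each study contributes the same number of rows.
print(table(stacked$study))
```

With real data, the two tables would come from different studies and only need to share the chrom, pos, and y columns before stacking.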

[Figure: two stacked panels, GWAS 1 (top) and GWAS 2 (bottom), each a Manhattan plot. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Details

Behind the scenes, manhattan is no more than a wrapper around ggplot, with a few tricks to transform a genomic axis into a scatterplot axis. In brief, manhattan transforms chrom:pos pairs into cumulative positions. For instance, chr2:1 becomes the length of chromosome 1 plus 1, chr3:1 becomes the length of chromosome 1 plus the length of chromosome 2 plus 1, and so on. It is therefore important to specify the genomic build (e.g. hg19) so that manhattan can make the correct transformation. It then positions the chromosome labels on the x-axis according to these transformations.

Because manhattan is only a wrapper around ggplot, it is highly customizable. For instance, we can change the size of the SNPs of interest by adding a geom_point layer.

    cad_gwas$size=ifelse(cad_gwas$rsid%in%c("rs602633","rs1333045"),5,2)
    manhattan(cad_gwas,build="hg18")+geom_point(aes(color=color,size=size),show.legend=FALSE)
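The cumulative-position transformation described in this section can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's actual implementation: the chromosome lengths are toy values (not a real genome build), `to_cumulative` is a hypothetical helper, and placing each label at the chromosome midpoint is an assumed choice.

```r
# Toy chromosome lengths (NOT a real genome build).
chrom_lengths <- c(chr1 = 1000, chr2 = 800, chr3 = 600)

# Offset of each chromosome = total length of all preceding chromosomes.
offsets <- c(0, cumsum(chrom_lengths)[-length(chrom_lengths)])
names(offsets) <- names(chrom_lengths)

# Hypothetical helper: map chrom:pos pairs to a single cumulative x value.
to_cumulative <- function(chrom, pos) {
  unname(offsets[as.character(chrom)]) + pos
}

# chr1:1, chr2:1, chr3:1 on the cumulative axis.
cum_pos <- to_cumulative(c("chr1", "chr2", "chr3"), c(1, 1, 1))
print(cum_pos)

# Assumed label placement: the midpoint of each chromosome.
label_pos <- offsets + chrom_lengths / 2
print(label_pos)
```

With these toy lengths, chr2:1 lands at 1000 + 1 and chr3:1 at 1000 + 800 + 1, matching the description above. Using the wrong build would shift every offset after the first chromosome, which is why the build argument matters.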

[Figure: Manhattan plot with the two SNPs of interest drawn as larger colored points. x-axis: Chromosome (1 to 22); y-axis: -log10(P-value), 0 to 80.]

Questions and Bugs

If you have any questions or want to report a bug, please open a GitHub issue on the package repository (boxiangliu/manhattan).
