RLint: Reformatting R Code to Follow the Google Style Guide

Alex Blocker, Andy Chen, Andy Chu, Tim Hesterberg, Jeffrey D. Oldham, Caitlin Sadowski, Tom Zhang

2014-07-02

Summary

RLint checks and reformats R code to follow the R style guide.
RLint is used within Google.
● Eases checking correctness.
● Improves programmer productivity.
Suggested experiment: adopt a consistent style guide + RLint.
● Does it improve your team's productivity?

Google Confidential and Proprietary

Style guides improve correctness and productivity

Q: How do we produce correct R code when
● correctness is hard to check,
● R programmer time is expensive?
Q: How do we maintain correct R code when
● modified by different programmers?


Many R files modified by multiple users

~40% of R files are modified by >1 Googler.
~50% of R directories contain code written by >1 Googler.

# Googlers modifying    % R files    % R directories
1                       60.9%        52.0%
2-3                     33.7%        36.6%
4-5                      3.8%         7.0%
6+                       1.5%         4.4%


Style guides improve correctness and productivity

Q: How do we produce correct R code when
● correctness is hard to check,
● R programmer time is expensive?
Q: How do we maintain correct R code when
● modified by different programmers?
A: R style guide specifies uniform coding.


Style guides specify program structure

Google R style guide specifies
● identifier naming: variable.name, FunctionName, kConstantName
● layout: indentation, spacing, ...
● comments
● function commenting
● ...
Success criterion: Any programmer should be able to
● instantly understand structure of any code.
Consistent style is more important than "perfect" style.
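The three naming conventions above can be written down as regular expressions. This is a minimal Python sketch, not RLint's actual code: the function name `classify_identifier` and the exact patterns are assumptions made for illustration.

```python
import re

# Hypothetical patterns: one assumed reading of each naming convention.
CONSTANT_RE = re.compile(r"^k[A-Z][A-Za-z0-9]*$")            # kConstantName
FUNCTION_RE = re.compile(r"^[A-Z][A-Za-z0-9]*$")             # FunctionName
VARIABLE_RE = re.compile(r"^[a-z][a-z0-9]*(\.[a-z0-9]+)*$")  # variable.name


def classify_identifier(name):
    """Return 'constant', 'function', or 'variable', or None if no convention matches."""
    if CONSTANT_RE.match(name):
        return "constant"
    if FUNCTION_RE.match(name):
        return "function"
    if VARIABLE_RE.match(name):
        return "variable"
    return None


if __name__ == "__main__":
    for identifier in ["variable.name", "FunctionName", "kConstantName", "bad_name"]:
        print(identifier, "->", classify_identifier(identifier))
```

A checker built this way would flag any identifier for which the function returns None.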

RLint: Automate style checking and correction

Goal: Minimize overhead of following style guide.
RLint: Program that warns about style violations.
● Optionally produces style-conforming code.
● Key idea: Computers are cheap.
Use within Google:
● All code violations flagged by code review tool.
● Violations must be corrected before code submission.


Ex: Spacing

Code:
    foo <-function(x){
      return (list (
        a = sum(x[,1]),
        b = 1/3+1e-7*(x[1,1]))
      …

Warnings:
● Place spaces around all binary operators (=, +, -, <-, etc.).
● Place a space before left parenthesis, except in a function call.

Corrected:
    foo <- function(x) {
      return(list(
        a = sum(x[, 1]),
        b = 1/3 + 1e-7 * (x[1, 1]))
      ...

Ex: Indentation

Code
    if (x == 5)
      while (x > 1)
        x <- x - 1
        print(x)

Is anything wrong?


Ex: Indentation

Code
    if (x == 5)
      while (x > 1)
        x <- x - 1
        print(x)    # R-bleed bug? ;)

Corrected code
    if (x == 5)
      while (x > 1)
        x <- x - 1
    print(x)

Ex: Ease checking program correctness

Code
    x <- -5:-1
    x[x <-2]

Is anything wrong?


Ex: Ease checking program correctness

Code
    x <- -5:-1
    x[x <-2]    # Hmm ...

Warning
    Must have whitespace around <-, <<-, etc.

Corrected code
    x <- -5:-1
    x[x <- 2]
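The whitespace rule in this warning can be sketched as a small Python check. This is an illustration only, not RLint's implementation: `find_unspaced_assignments` is a hypothetical helper, and it assumes comments and strings have already been removed from the line. It only flags the operator; it cannot tell whether the author meant `x <- 2` or `x < -2`.

```python
import re

# Assumed rule: "<-" and "<<-" need whitespace on both sides.
# "<<-" is listed first so it is not matched as "<" followed by "<-".
ASSIGN_RE = re.compile(r"<<-|<-")


def find_unspaced_assignments(line):
    """Return the column offsets of assignment operators lacking surrounding whitespace."""
    offsets = []
    for match in ASSIGN_RE.finditer(line):
        # Treat the start and end of the line as whitespace.
        before = line[match.start() - 1] if match.start() > 0 else " "
        after = line[match.end()] if match.end() < len(line) else " "
        if not before.isspace() or not after.isspace():
            offsets.append(match.start())
    return offsets


if __name__ == "__main__":
    for source_line in ["x <- -5:-1", "x[x <-2]", "x[x <- 2]"]:
        print(repr(source_line), find_unspaced_assignments(source_line))
```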

Ex: Ease checking program correctness

Code
    if (format(Sys.time(), "%Y") == "2014") {
      print(paste("UseR!", "2014")
    }

Is anything wrong?


Ex: Ease checking program correctness

Code
    if (format(Sys.time(), "%Y") == "2014") {
      print(paste("UseR!", "2014")
    }

Error
    CRITICAL:root:Unbalanced brackets in { print(paste("UseR!", "2014") }
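An unbalanced-bracket error like the one above can be detected with a stack. The Python sketch below shows the general technique, not RLint's own code. `brackets_balanced` is a hypothetical name, and the sketch assumes strings and comments contain no brackets (for example, because they were stubbed out beforehand).

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def brackets_balanced(code):
    """Return True if every (, [ and { in `code` is closed in the right order."""
    stack = []
    for ch in code:
        if ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            # A closer with no opener, or with the wrong opener, is an error.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    # Any opener left on the stack was never closed.
    return not stack


if __name__ == "__main__":
    print(brackets_balanced('{ print(paste("UseR!", "2014") }'))
    print(brackets_balanced('{ print(paste("UseR!", "2014")) }'))
```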

RLint implementation uses Python

Use Python string functions and regular expressions.

Algorithm:
Stub out comments, strings, user-defined operators.
● Ex: Comment may contain code!
● Ex: Multi-line string
Check spacing.
Align & indent lines within {}, () and [].
● Align lines by opening bracket.
● Align lines by '=' if they are in the same bracket.
Align if/while/for (...) not followed by {}.
Unstub comments, strings, user-defined operators.
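The first and last steps of the algorithm (stubbing and unstubbing) can be sketched as follows. This is a hedged illustration under stated assumptions, not RLint's source: the names `stub` and `unstub`, the NUL-delimited placeholder format, and the token patterns are all hypothetical.

```python
import re

# Assumed token patterns; earlier alternatives win when several could start
# at the same position, and the leftmost match always wins overall.
TOKEN_RE = re.compile(
    r'"(?:\\.|[^"\\])*"'     # double-quoted string, may span several lines
    r"|'(?:\\.|[^'\\])*'"    # single-quoted string
    r"|#[^\n]*"              # comment running to the end of the line
    r"|%[^%\n]*%",           # user-defined operator such as %in%
    re.DOTALL,
)


def stub(code):
    """Replace comments, strings and %op% operators with numbered placeholders."""
    saved = []

    def replace(match):
        saved.append(match.group(0))
        return "\x00%d\x00" % (len(saved) - 1)

    return TOKEN_RE.sub(replace, code), saved


def unstub(code, saved):
    """Put the saved text back in place of each placeholder."""
    return re.sub(r"\x00(\d+)\x00", lambda m: saved[int(m.group(1))], code)


if __name__ == "__main__":
    source = 'x <- "a <-b" # y<-1\nz<-2'
    stubbed, saved = stub(source)
    print(saved)
    print(unstub(stubbed, saved) == source)
```

With the stubs in place, spacing and alignment rules can be applied using plain regular expressions without accidentally rewriting code-like text inside a comment or string.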

Application: Improve R community's style consistency

Proposal: Adopt R style guide + RLint.
● Run experiments to determine net benefit.
Small scale: Individual teams (pkgs) adopt style guide + checker.
● Are these programmers more productive?
● More bug fixes and fewer (un-fixed) bug reports?
Medium scale: CRAN packages opt into style guide + checker.
● Specify style guide + checker program.
● Enforced by CRAN server farm.



Coding conventions and checkers

Coding conventions have existed for decades.
● 1918: The Elements of Style by Strunk & White (writing English)
● 1974: The Elements of Programming Style (writing code)
● 1997: Java code conventions
● 2001: Python style guide
● 2014: Google style guides for 12 languages available
Style checkers have existed for decades.
● 1977: lint checks C style
● 2002: PyChecker checks Python style
● 2011: gofmt reformats Go code (70% adoption in 2013)
