CS 50 Walkthrough 6
Problem Set 6: Mispellings
Marta Bralic
Slides courtesy of: Keito Uchiyama

Problem Set 6: Mispellings
•  Topics:
 –  More data structures, more pointers
 –  More File I/O
•  You implement:
 –  A dictionary for a very fast spell checker


The Distribution Code
•  texts – a symlink
•  speller.c – a spellchecker
•  dictionary.h – the header file
•  dictionary.c – a dictionary implementation
•  Makefile
•  questions.txt


What to implement
•  load() – loads a dictionary into memory
•  size() – gets the size of the dictionary
•  unload() – unloads a dictionary from memory
•  check() – checks if a given word is in the dictionary


Your options
•  Slow but simple: Linear search every time
 –  don't do this!
•  Hash tables
•  Tries


Hash Tables

Image courtesy of User:Davidgothberg and User:Helix84, Wikimedia Commons


Hash Tables - Operations
•  Initializing our hash table
•  Adding dictionary words
•  Checking words
•  Unloading words


Initializing
 // Here's how our node is defined
 typedef struct node
 {
     char word[LENGTH + 1];
     struct node *next;
 }
 node;

 // We have our main directory of node pointers
 node *myarray[ARRAYSIZE];

 // for each element i in myarray:
 //     myarray[i] = NULL
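The initialization step above can be sketched in C roughly as follows. The values of LENGTH and ARRAYSIZE and the helper name init_table are assumptions made for illustration; the slides do not fix them.

```c
#include <stddef.h>

#define LENGTH 45        /* assumed maximum word length */
#define ARRAYSIZE 1000   /* assumed number of buckets */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

/* The directory of node pointers: one linked-list head per bucket. */
node *myarray[ARRAYSIZE];

/* Hypothetical helper: mark every bucket as empty. */
void init_table(void)
{
    for (int i = 0; i < ARRAYSIZE; i++)
    {
        myarray[i] = NULL;
    }
}
```

A global array like this already starts out zeroed in C, so the explicit loop mainly matters if the table is ever reused after an unload.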


Hash Tables – a Hash Function
 function myHashFunction(string):
     int hashresult = 0
     foreach character in string:
         hashresult += character - 97    // 97 is ASCII 'a'
     return hashresult % ARRAYSIZE
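One possible C version of this pseudocode is sketched below. The name my_hash_function and the ARRAYSIZE value are assumptions. The sketch also departs from the slide in two labeled ways: it lowercases letters and skips non-letters such as apostrophes, so that the running sum can never go negative.

```c
#include <ctype.h>

#define ARRAYSIZE 1000   /* assumed number of buckets */

/* Sums each letter's offset from 'a', then wraps into the table size. */
unsigned int my_hash_function(const char *string)
{
    unsigned int hashresult = 0;
    for (int i = 0; string[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) string[i];
        if (isalpha(c))
        {
            /* Assumption: ignore case so "Cat" and "cat" share a bucket. */
            hashresult += (unsigned int) (tolower(c) - 'a');
        }
        /* Assumption: non-letters (e.g. apostrophes) contribute nothing. */
    }
    return hashresult % ARRAYSIZE;
}
```

A sum of letters is easy to follow but clusters anagrams and short words into the same few buckets, so a better-mixing function is worth trying once this works.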


Loading Dictionary Words
•  fopen(dict) same as in pset5
•  while !feof(dict)
 –  create nodes for them
 –  put these nodes in the hash table


Creating Nodes
•  malloc space for new node (node *newnodeptr)
 –  store each letter i of the word in that node
  •  fgetc(dptr) is that letter
  •  newnodeptr->word[i] is where letter should be stored
  •  until you reach '\n'
 –  newnodeptr->word[j] = '\0' at this point
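The node-creation steps above might look like this in C. The function name read_node and the LENGTH value are assumptions. The sketch also checks for EOF and drops letters beyond LENGTH, which the slide does not mention.

```c
#include <stdio.h>
#include <stdlib.h>

#define LENGTH 45   /* assumed maximum word length */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

/* Hypothetical helper: reads one line from dptr into a freshly malloc'd
 * node. Returns NULL at end of file or if malloc fails. */
node *read_node(FILE *dptr)
{
    node *newnodeptr = malloc(sizeof(node));
    if (newnodeptr == NULL)
    {
        return NULL;
    }

    int i = 0;
    int c;
    /* fgetc returns an int so that EOF can be told apart from a letter. */
    while ((c = fgetc(dptr)) != EOF && c != '\n')
    {
        if (i < LENGTH)   /* never write past the end of word[] */
        {
            newnodeptr->word[i] = (char) c;
            i++;
        }
    }
    newnodeptr->word[i] = '\0';
    newnodeptr->next = NULL;

    /* Nothing was read before hitting EOF: there is no word here. */
    if (c == EOF && i == 0)
    {
        free(newnodeptr);
        return NULL;
    }
    return newnodeptr;
}
```

Calling read_node in a loop until it returns NULL avoids relying on feof alone, which only becomes true after a read has already failed.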


Put Node in Hash Table
•  hash(newnodeptr->word)
 –  go to that place in array (array[hashresult])
•  if nothing is there (NULL)
 –  put a pointer to your node that you just malloced there
 –  set newnodeptr->next to NULL
•  else
 –  set newnodeptr->next to the pointer currently there
 –  put your pointer there
•  when while loop exits, fclose(dptr)
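A minimal sketch of the insertion step, following the two branches on the slide. The name put_node, the constants, and the stand-in hash function (which assumes lowercase input) are illustrative assumptions.

```c
#include <stddef.h>

#define LENGTH 45        /* assumed maximum word length */
#define ARRAYSIZE 1000   /* assumed number of buckets */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

node *myarray[ARRAYSIZE];

/* Stand-in hash: sums letter offsets from 'a'; assumes lowercase input. */
unsigned int hash(const char *word)
{
    unsigned int hashresult = 0;
    for (int i = 0; word[i] != '\0'; i++)
    {
        if (word[i] >= 'a' && word[i] <= 'z')
        {
            hashresult += (unsigned int) (word[i] - 'a');
        }
    }
    return hashresult % ARRAYSIZE;
}

/* Hypothetical helper: links a node in at the head of its bucket's list. */
void put_node(node *newnodeptr)
{
    unsigned int hashresult = hash(newnodeptr->word);
    if (myarray[hashresult] == NULL)
    {
        /* Empty bucket: the new node becomes the whole list. */
        myarray[hashresult] = newnodeptr;
        newnodeptr->next = NULL;
    }
    else
    {
        /* Occupied bucket: point at the old head, then take its place. */
        newnodeptr->next = myarray[hashresult];
        myarray[hashresult] = newnodeptr;
    }
}
```

Because the old head is NULL when the bucket is empty, the else branch alone would handle both cases; the two branches are kept here only to mirror the slide.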


Size
Really easy if you've kept a counter that you increment every time you load a word.
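As a rough illustration of the counter idea, assuming a file-level variable named word_count and a hypothetical helper count_loaded_word that load would call once per word:

```c
/* Assumed file-level counter, bumped once per word during load. */
static unsigned int word_count = 0;

/* Hypothetical hook: call this each time load stores a word. */
void count_loaded_word(void)
{
    word_count++;
}

/* Returns the number of words loaded so far, with no traversal needed. */
unsigned int size(void)
{
    return word_count;
}
```

Keeping the count up to date during load means size never has to walk the table.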


Check
•  convert each letter of word tolower
•  hash word and go to that place in array
 –  temporarily store the ptr you find there
 –  using that ptr, traverse the linked list looking for the word in ptr->word
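The check steps above could be sketched as follows. The constants and the stand-in hash function are assumptions; the lowercase copy is made in a local buffer so the caller's word is left untouched.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define LENGTH 45        /* assumed maximum word length */
#define ARRAYSIZE 1000   /* assumed number of buckets */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

node *myarray[ARRAYSIZE];

/* Stand-in hash: sums letter offsets from 'a'; assumes lowercase input. */
unsigned int hash(const char *word)
{
    unsigned int hashresult = 0;
    for (int i = 0; word[i] != '\0'; i++)
    {
        if (word[i] >= 'a' && word[i] <= 'z')
        {
            hashresult += (unsigned int) (word[i] - 'a');
        }
    }
    return hashresult % ARRAYSIZE;
}

/* Returns true if word (in any letter case) is stored in the table. */
bool check(const char *word)
{
    size_t len = strlen(word);
    if (len > LENGTH)
    {
        return false;   /* too long to be in the dictionary */
    }

    /* Lowercase copy; the loop also copies the terminating '\0'. */
    char lower[LENGTH + 1];
    for (size_t i = 0; i <= len; i++)
    {
        lower[i] = (char) tolower((unsigned char) word[i]);
    }

    /* Walk the bucket's linked list looking for an exact match. */
    node *ptr = myarray[hash(lower)];
    while (ptr != NULL)
    {
        if (strcmp(ptr->word, lower) == 0)
        {
            return true;
        }
        ptr = ptr->next;
    }
    return false;
}
```

This relies on the dictionary words having been stored in lowercase, which is an assumption about the dictionary file.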


Unload
•  Iterate through each node, like in check
 –  free the node
 –  free the spot in the array that starts the linked list
 –  return true
•  run valgrind to ensure no leaks!
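A sketch of unload under the same assumed table layout. Since the array itself is not malloc'd in this layout, "freeing the spot in the array" is interpreted here as freeing the head node along with the rest of the list and then resetting the slot to NULL; that reading is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

#define LENGTH 45        /* assumed maximum word length */
#define ARRAYSIZE 1000   /* assumed number of buckets */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

node *myarray[ARRAYSIZE];

/* Frees every node in every bucket and leaves the table empty. */
bool unload(void)
{
    for (int i = 0; i < ARRAYSIZE; i++)
    {
        node *ptr = myarray[i];
        while (ptr != NULL)
        {
            node *next = ptr->next;   /* save the link before freeing */
            free(ptr);
            ptr = next;
        }
        myarray[i] = NULL;            /* bucket no longer points at freed memory */
    }
    return true;
}
```

Reading ptr->next after free(ptr) would touch freed memory, which is why the next pointer is saved first.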


Tries

Tries – A struct
 // Here's an example of what each node
 // in our struct will look like
 typedef struct node
 {
     bool is_word;
     struct node *children[27];
 }
 node;

 // We have our root node
 node *root;
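Before any words are added, the root needs to exist with all of its children empty. A minimal sketch, assuming a hypothetical helper named new_node:

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node
{
    bool is_word;
    struct node *children[27];
}
node;

node *root;

/* Hypothetical helper: allocates a node with no children that does not
 * mark the end of a word. Returns NULL if malloc fails. */
node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->is_word = false;
    for (int i = 0; i < 27; i++)
    {
        n->children[i] = NULL;
    }
    return n;
}
```

With this in place, initializing the trie is just root = new_node(); followed by a NULL check.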


Tries - Operations
•  Initializing our root node
•  Adding words
•  Checking words
•  Unloading words
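The slides list the trie operations without detail, so the following is only one way they might be filled in. All function names are hypothetical, and the mapping of the 27 children (26 letters plus an apostrophe in slot 26) is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct node
{
    bool is_word;
    struct node *children[27];
}
node;

/* Allocates an empty node; returns NULL if malloc fails. */
node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n != NULL)
    {
        n->is_word = false;
        for (int i = 0; i < 27; i++)
        {
            n->children[i] = NULL;
        }
    }
    return n;
}

/* Assumed mapping: letters (any case) -> 0..25, '\'' -> 26, else -1. */
int child_index(char c)
{
    if (c == '\'')
    {
        return 26;
    }
    if (isalpha((unsigned char) c))
    {
        return tolower((unsigned char) c) - 'a';
    }
    return -1;
}

/* Adds word below root, creating nodes as needed. False on failure. */
bool trie_add(node *root, const char *word)
{
    node *cur = root;
    for (int i = 0; word[i] != '\0'; i++)
    {
        int idx = child_index(word[i]);
        if (idx < 0)
        {
            return false;   /* character the trie cannot store */
        }
        if (cur->children[idx] == NULL)
        {
            cur->children[idx] = new_node();
            if (cur->children[idx] == NULL)
            {
                return false;   /* out of memory */
            }
        }
        cur = cur->children[idx];
    }
    cur->is_word = true;
    return true;
}

/* Follows word letter by letter; true only if it ends on a word node. */
bool trie_check(const node *root, const char *word)
{
    const node *cur = root;
    for (int i = 0; word[i] != '\0'; i++)
    {
        int idx = child_index(word[i]);
        if (idx < 0 || cur->children[idx] == NULL)
        {
            return false;
        }
        cur = cur->children[idx];
    }
    return cur->is_word;
}

/* Frees all children recursively, then the node itself. */
void trie_unload(node *n)
{
    if (n == NULL)
    {
        return;
    }
    for (int i = 0; i < 27; i++)
    {
        trie_unload(n->children[i]);
    }
    free(n);
}
```

Lookup time here depends on the length of the word rather than on how many words are stored, which is the main appeal of a trie over a hash table with long chains.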


Tips
•  Start with a small dictionary and small text file (speller [dict] file)
•  Mapping out data structures on paper and on screen
•  Using gdb


Questions?

