Lab: Using Regular Expressions with grep

In this lab, you will use regular expressions and grep with text files to locate requested data.

Resources:
Files: http://classroom.example.com/pub/materials/awesome_logs
Machines: serverX

Outcomes:

Follow the clues and help Dr. Zingruber recover the lost "artwork."

Dr. Zingruber: Hello! I hear you are the person to talk to for help with Red Hat Enterprise Linux systems administration?

Yes, that is me.

Dr. Zingruber: Maybe you can help me, then; you are my last hope. Something terrible has happened. There has been a heist at the Museum of Awesome!

What was stolen?

Dr. Zingruber: It was a work by Wander van Gogh.

Wander van Gogh? Never heard of him.

Dr. Zingruber: I am not surprised. He is a descendant of Vincent van Gogh, but is far, FAR more insane. This is one of his most important pieces, which is why we have it at the Museum of Awesome.

I see. When was the piece taken?

Dr. Zingruber: It was August 8, sometime between 1:00 p.m. and 3:00 p.m.

Wait, what? It was taken August 8, and you are just now investigating?

Dr. Zingruber: Yes, well, to be frank, no one really noticed it was missing until now. You see, while the piece was in the Museum of Awesome, it was in the Hall of Mildly Awesome. If you go to the Museum of Awesome, are you going to the Hall of Mildly Awesome or the Cavern of Supreme Awesome? Because of its placement, no one really looks at it, and just between you and me, it kind of creeps me out.

Um... Okay. So what else can you tell me about the heist?

Dr. Zingruber: Well, we do have a variety of logs for different things. You can download them from http://classroom.example.com/pub/materials/awesome_logs. I think the place to start your investigation would be door.log around the time of the event.

  • Reset your serverX system.

  1. Download the logs to your machine, and change directory to the logs directory.

    [root@serverX ~]# wget -r -l 1 -np http://classroom.example.com/pub/materials/awesome_logs
    [root@serverX ~]# cd classroom.example.com/pub/materials/awesome_logs

  2. Use grep to search through the door.log. Follow any further instructions you may find in the logs.

    [root@serverX awesome_logs]# grep '^Aug *8 1[345]' door.log
    [root@serverX awesome_logs]# grep '^Aug *8 14.*OPEN' door.log
    ... Output Truncated ...
    Aug  8 14:37:03 alarm_monitor activity: back door: OPEN Dr Zingruber: "Oh yes...
    Aug  8 14:40:01 alarm_monitor activity: back door: OPEN look here, you can see
    Aug  8 14:41:26 alarm_monitor activity: back door: OPEN the door stayed open.
    Aug  8 14:43:55 alarm_monitor activity: back door: OPEN Now that we know a more
    Aug  8 14:46:20 alarm_monitor activity: back door: OPEN exact time, we should
    Aug  8 14:48:31 alarm_monitor activity: back door: OPEN check wall.log for the
    Aug  8 14:51:30 alarm_monitor activity: back door: OPEN same period.
    ... Output Truncated ...
    [root@serverX awesome_logs]# grep '^Aug *8 14:[345]' wall.log
    [root@serverX awesome_logs]# grep '^Aug *8 14.*ALERT' wall.log
    Aug  8 14:37:03 alarm_monitor ALERT: Mildly Awesome: Dr. Zingruber: Ah, yes here
    Aug  8 14:40:01 alarm_monitor ALERT: Mildly Awesome: it is, looks like they
    Aug  8 14:41:26 alarm_monitor ALERT: Mildly Awesome: digitized the image. We
    Aug  8 14:43:55 alarm_monitor ALERT: Mildly Awesome: should check proxy.log at
    Aug  8 14:46:20 alarm_monitor ALERT: Mildly Awesome: 14:40.  The digitalized
    Aug  8 14:48:31 alarm_monitor ALERT: Mildly Awesome: image will be on the 24
    Aug  8 14:51:30 alarm_monitor ALERT: Mildly Awesome: lines following the log.
    [root@serverX awesome_logs]# grep -A 24 '14:40' proxy.log
    Aug  8 14:40:03 Outbound data Captured...Dr. Zingruber: You found it, thank you!
    ................................MMMMMMMMMMMMMMN~................................
    ............................:MMMMMMMMMMMMMMMMMMMMM?.............................
    ..........................DMMMMMMN88MMMMNZZZZMMMMMMMN...........................
    ........................+MMMMMMMZZZZZZZZZZZMM8ZMMMMMMM?.........................
    .......................MMMMMMMMZZZZZZZZZZZZZZZZZMMMMMMMM........................
    ..................... MMMMMMMMMZOMMMMMMZZZZZZZZZZMMMMMMMM ......................
    .................... MMMMMMMMMOZZZNZZZZZZZZZZZZZZMMMMMMMMM .....................
    ....................MMMMMMDDDNMZZZZZZZZZZZZZZZZZZZMMMMMMMMM.....................
    ...................+MMM$ZZZZZ8MMMZZZZZZZZZZZZZZZZZMMMMMMMMM~....................
    ...................MMMMZZZZZZZDMMMMMMMNZZZZZZZZZZZNZ8MMMMMMM....................
    ...................MMMMOZZZZZZZZZMMMMMMMZZZZZZZZZZZZZZZDMMMM....................
    ..................,MMMMMMZZZZZZZZZZZZMMMMZZZZZZZZZZZZZZZZMMMZ...................
    ..................DMMMMMMMMZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZMMMD...................
    ..................ZMMMMMMMMMMDZZZZZZZZZZZZZZZZZZZZZZZZZZMMMMN...................
    ..................,MMMMMMMMN..:MNZZZZZZZZZZZZZZZZZZZZZZMMMMM7...................
    .................. MMMMMMMM?... M,MMMZZZZZZZZZZZZZZZZMMMMMMM ...................
    ...................MMMMMN7MMI..... MMMMMMMMNNNNNMMMMMMMMMMMM....................
    ...................    ....7MM ..... IDZ=..$, =I MMMMMMMMMM.....................
    .................................................MMMMMMMMMM.....................
    ........................................IMMM,...NMMMMMMMMM Z88..................
    ................................................MMN......... ...................
    ..............................................IMD...............................
    ............................................ $M.................................
    ................................................................................
    

    Dr. Zingruber noted that the theft occurred between 1:00 p.m. and 3:00 p.m., which is 13:00 to 15:00 in the 24-hour format used by the logs. Our regular expression should match on the date and the hour field of the timestamp to select the relevant entries, as the first door.log search above does.
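To see how a date-and-hour pattern behaves, here is a minimal sketch run against a handful of made-up entries. The sample lines and the variable names `sample` and `hour_matches` are assumptions for illustration only; the real door.log contents will differ.

```shell
# Hypothetical sample entries in the same timestamp style as door.log.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Aug  8 12:59:58 alarm_monitor activity: front door: CLOSED
Aug  8 13:05:10 alarm_monitor activity: front door: OPEN
Aug  8 14:37:03 alarm_monitor activity: back door: OPEN
Aug  8 16:00:00 alarm_monitor activity: back door: CLOSED
Aug 18 14:00:00 alarm_monitor activity: back door: OPEN
EOF

# ^Aug    anchor the match at the start of the line
#  *      zero or more space characters
# 8       the day of the month, followed by one space
# 1[345]  an hour of 13, 14, or 15
hour_matches=$(grep -c '^Aug *8 1[345]' "$sample")
echo "Entries on Aug 8 in hours 13-15: $hour_matches"

rm -f "$sample"
```

Only the 13:05 and 14:37 entries are counted. The Aug 18 line is skipped because the pattern requires the 8 to come directly after the spaces that follow Aug.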

    Note that there are TWO space characters between Aug and 8 in the log file date format. To address this, you can use two spaces in your regular expression or a repetition operator (such as *) on the space character. You may have to look through the matched data to find what you are looking for. If you cannot find it, try the second door.log search shown above, which keeps only the OPEN entries in the 14:00 hour.
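The spacing options can be compared side by side. This is a small sketch on a single entry copied from the door.log output above; the variable names are illustrative only.

```shell
# One entry in the door.log timestamp style (two spaces between Aug and 8).
line='Aug  8 14:37:03 alarm_monitor activity: back door: OPEN'

# Two literal spaces in the pattern.
two_spaces=$(printf '%s\n' "$line" | grep -c '^Aug  8 14')

# Basic regular expression: " *" means zero or more spaces.
star=$(printf '%s\n' "$line" | grep -c '^Aug *8 14')

# Extended regular expression (-E): " +" means one or more spaces.
plus=$(printf '%s\n' "$line" | grep -Ec '^Aug +8 14')

# A single literal space does not match; grep -c prints 0 and exits
# with status 1, so "|| true" keeps the script going.
one_space=$(printf '%s\n' "$line" | grep -c '^Aug 8 14' || true)

echo "two spaces: $two_spaces, star: $star, plus: $plus, one space: $one_space"
```

The first three patterns each match the line, while the single-space pattern does not, which is why the repetition operator is the more forgiving choice when the day of the month may be one or two digits wide.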

    In the door.log entries, we were referred to the wall.log file, but we now have a narrower time window. Use grep to look for entries between time codes 14:37 and 14:51.
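The lab's own wall.log search uses 14:[345], which matches any minute from 30 to 59. If an exact 14:37 to 14:51 window is wanted, one possible approach is an extended regular expression with alternation. The sketch below is an optional refinement, not the lab's solution, and the sample entries and variable names are made up for illustration.

```shell
# Hypothetical entries around the window of interest.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Aug  8 14:36:59 alarm_monitor ALERT: Mildly Awesome: before the window
Aug  8 14:37:03 alarm_monitor ALERT: Mildly Awesome: first entry
Aug  8 14:40:01 alarm_monitor ALERT: Mildly Awesome: middle entry
Aug  8 14:51:30 alarm_monitor ALERT: Mildly Awesome: last entry
Aug  8 14:52:00 alarm_monitor ALERT: Mildly Awesome: after the window
EOF

# Broad pattern from the lab: minutes 30 through 59.
broad=$(grep -c '^Aug *8 14:[345]' "$sample")

# Tighter pattern: minutes 37-39, 40-49, or 50-51.
tight=$(grep -Ec '^Aug +8 14:(3[7-9]|4[0-9]|5[01])' "$sample")

echo "broad: $broad, tight: $tight"
rm -f "$sample"
```

The broad pattern counts all five sample lines, while the tighter one counts only the three that fall inside the window. For a one-off investigation, the broad pattern plus a quick read of the output is usually enough.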

    Note that, again, there are TWO space characters between Aug and 8 in the log file date format, so use two spaces in your regular expression or a repetition operator on the space character. You may have to look through the matched data to find what you are looking for. If you cannot find it, try the second wall.log search shown above, which keeps only the ALERT entries in the 14:00 hour.

    In the wall.log entries, we were referred to the proxy.log file, and we now have an exact time. Use grep to find time code 14:40. We want not only the line for time code 14:40, but also the 24 lines that follow that log entry, which is what the -A 24 option in the proxy.log search above provides.
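The -A option (lines of context after a match) is easier to see on a small file. This is a minimal sketch using made-up contents and -A 2 instead of -A 24; the "proxy: idle" and "row" lines and the variable names are assumptions for illustration.

```shell
# Hypothetical file: one matching entry followed by payload lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Aug  8 14:39:58 proxy: idle
Aug  8 14:40:03 Outbound data Captured
row 1
row 2
row 3
Aug  8 14:45:00 proxy: idle
EOF

# Print the matching line plus the 2 lines that follow it.
context=$(grep -A 2 '14:40' "$sample")
printf '%s\n' "$context"

# Count the lines returned: the match itself plus 2 lines of context.
context_lines=$(printf '%s\n' "$context" | grep -c '')
rm -f "$sample"
```

The output is the 14:40:03 entry followed by "row 1" and "row 2". Because the pattern 14:40 is not anchored, it could also match a timestamp such as 13:14:40 in a larger file; a pattern like '^Aug *8 14:40' would be stricter if that turns out to matter.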

    You should now have recovered the "artwork."

Revision: rh134-7-c643331