ITECS Systems Staff Notes

Sunday, November 22, 2009

Billy: November 16-22

journal entry posted by wrbeaudo on November 22, 2009 11:39pm
  • Catching Up - Reading email, returning voicemail, and generally just getting back up to speed after being out for 3 weeks.
  • AD - Some random fixes in the ENGR, Wolftech, and IES domains.
  • VMware - more work on pricing model and discussion about cluster sizing/design.

Meetings:

  • BigFix Demo
  • Architect Interview
  • Backups/Nagios
  • Logging
  • AD Policy

Friday, November 20, 2009

Gary - 11/9 to 11/20

journal entry posted by gsgatlin on November 20, 2009 5:47pm

I finished making the current patchlist for Solaris 10. I created an rpm
to get around a bug in patch 139555-08. I shared my changes
with the Solaris 10 mailing list.

I worked on remedy calls.

I went to a meeting about splunk and the new log server.

I helped Lance Mangum and Dr. Hassan in MAE with a problem with RAM
in a workstation going bad.

I told Mark Barefoot I could not help him with his Linux lab problems
because they are really network problems. I suspect he has a bad
wall port in his Linux lab.

Worked with OIT on the changes needed for our AFS servers.

I updated our AFS install docs with the newest correct info
so that Richard and/or Daniel can configure an AFS server after
a fresh install.

I assisted early Thursday morning with an AFS file server failure. I
helped get the server back up and diagnose that it was in fact a bad
hard disk that caused the outage.

I found a possible problem with the connection to the SAN on
engr11f on Friday.

I helped Justin with new lab hardware. I have installed Realm Linux
on a DELL Optiplex 390 ITECS has bought. The install went ok.

I sent our backup spreadsheet to Shawn McIntosh in OIT. I am still
waiting to hear something from them.

I fixed a problem on all Linux lab machines where they were not seeing
all the print queues in a lab. Now every lab has every wolfcopy
printer in a building show up in the print dialog. I made a lot of
changes to the set-printer script to make that work.

I moved a lot of AFS volumes around because of upgrades.

I finished upgrading only a single server, engr06f. We will work
on upgrading the other servers next week.

Rob's Week

journal entry posted by rfgrau on November 20, 2009 5:09pm

Mediasite
-investigate quorum drive strangeness / RE: backups
-work on sql storage usage estimate script
-revise Mediasite service plan draft
-fix group policy that was keeping us from running cron jobs
-open firewall for VCS operators to use Mediasite Editor

Other
-assist Michael w/ nagios
-install Windows 7 x64 from WDS w/ Michael
-talk to Shawn M of OIT about OIT backups
-splunk demo

Monday, November 16, 2009

ddballar: (11/09/2009 - 11/15/2009)

journal entry posted by ddballar on November 16, 2009 10:15am

Finished up deploying a file server in WolfTest domain. Futzed with DFS links to shares on it that are not working quite like I expect. Windows 2008 with UAC behaves very differently when traversing directory structures, so that took some getting used to, in addition to playing around with UAC settings to make it behave "normally and rationally" again.

Attempted to deploy the Nagios software package in WolfTest, and configure the respective group policies.

Configured a group policy to patch and reboot the WolfTest domain controllers unattended.

Did some work on backup scripts.

Started reviewing platform specific group policies.

Attended ITECS staff meeting.

Meeting with Joe Sutton (IES) regarding use of OIT Celerra instance as a Windows file server replacement. Demonstrated how to create and permission a share on a Celerra file server. Also did some AD replication troubleshooting for their domain.

Daniel, 11/9 - 11/15

journal entry posted by dssink on November 16, 2009 9:40am
  • Moved people to vlan 30
  • Set up an interface to test arp table problems in vlan 30
  • Tweaked boyette's lockers, they should be straight now
  • Set up a splunk VM to give Pete
  • Handled an OOM-killer incident and a few other things while on call, nothing terribly serious

Michael: 11/9-11/13

journal entry posted by mpunderw on November 16, 2009 9:37am

- Bentley license server reconfig
- Lab image new software
- Adding drivers to Win7 boot image
- Opnet
- help Justin move Solar Center people to Wolftech

Friday, November 13, 2009

Richard 11/7 - 11/13

journal entry posted by rsmclane on November 13, 2009 7:06pm

Activities:

  • I believe I have all the VMware permissions working how we want now, will have Meimei test the Cluster Admin patching hosts next time host patches are available
  • Patched both VMware clusters
  • Demonstrated the ARP issues in vlan 30 for Comtech. They are researching further
  • Shifted around the storage for a number of VMs to facilitate some datastore adjustments
  • Wrote up some documentation on Storage Motion

Meetings:

  • VMware Team meeting
  • ITECS Staff meeting
  • Migration meeting with IES
  • IAM Team meeting

Rob's Week

journal entry posted by rfgrau on November 13, 2009 4:53pm

Mediasite
-prep for extending Mediasite storage
-successfully extend storage space
-brainstorm for Mediasite service parameters
-ADAM / AD LDS backup configuration
-upgrade test environment to 5.2
-Resolved problem with installing SQL 2005 hotfix in cluster
-Work on SQL space monitoring

Other
-assist Michael w/ Microstation license server issues.
-ITECS Staff Meeting
-Majordomo administration for Dean's Office

Tuesday, November 10, 2009

ddballar: (11/02/2009 - 11/08/2009)

journal entry posted by ddballar on November 10, 2009 9:57am

Worked on setting up DFS server infrastructure in WolfTest domain. Installed wtest-00-dfs, and am working on setting up a test file server. Ran into technical issues getting it configured. Trying to work through those issues, and learning a bit about WDS in the process.

Did some troubleshooting with Barak in IE; he was having WolfTech DC name resolution issues, and I wanted to see what that was about.

Consulted with Joe Sutton in IES(?) regarding Windows file server in the colleges cluster.

Wrote up draft of steps to upgrade WolfTech DCs to Windows 2008/R2.

Troubleshooting on some domain controller backup scripts. Want to test some different options for configuring this in WolfTest, but don't have a good storage location yet. Arg.

Monday, November 09, 2009

Richard 10/31 - 11/6

journal entry posted by rsmclane on November 9, 2009 7:20pm

Activities:

  • Follow up with OIT over what happened with the SAN
    • Apparently they were doing some maintenance, SAN link failover should have happened, investigating further
    • May need Solaris kernel patches to be in sync with the PowerPath software
  • Worked with Daniel to get the last bits of mail corrected after the crash
  • Talked with OIT to get Gary the information he needed to get Solaris patch access
  • Locker stuff: phpGacl access for some CWSes, worked with Daniel on some major research locker shuffling, other remedy responses
  • Worked on VMware permissions

Meetings:

  • Sr-Network-Architect Interview
  • AITD meeting
  • CLS meeting
  • EDUCAUSE presentation on Green-IT
  • EDUCAUSE presentation on cost/management considerations

Michael - 11/2-11/5

journal entry posted by mpunderw on November 9, 2009 9:33am

- worked on AFS client for Windows 7
- finalized admindesktop
- worked on Groupwise for Windows 7
- Outlook becoming default IMAP client for admindesktop
- customized boot image for Derek
- lab sleep problem
- worked on reorganizing admin shares

Friday, November 06, 2009

Gary - 11/1 to 11/6

journal entry posted by gsgatlin on November 6, 2009 5:51pm

I worked on remedy calls.

I worked on the file server outage on Sunday night.
I helped with the restore from tape on engr00mb on Monday.
I did troubleshooting on the file server to try to determine what went wrong.

I helped with a Realm CRON job that showed that only engr01jab.eos.ncsu.edu
is affected by the missing disk problem in VMware ESX.

I spent time trying to get the O.N. for Solaris patches. I did not have
luck but Richard was able to get them over the phone for me.

I tried to patch Solaris 10 with the newest security patches. But
so far I have had little luck. My security-patchlist-only install
produces a system that will not boot. I will try downloading more
patches next week to see if the recommended patches fix this no boot
problem. We are still waiting to hear back from OIT if they suggest
any specific patches. It seems we are the only ones on campus using
Solaris 10 for file servers. Evidently OIT is still using Solaris 8
for all AFS servers. So it may take a bit for them to figure out what
we may need to add. But for now I am using sunsolve to try to get the
machines patched up to the current release of Solaris in case that helps.
(It also will make the machines more secure)

I upgraded kernels on all Linux lab / vcl / desktop type systems.

I upgraded kernels on all Linux ESX boxes I am responsible for. I also
removed the clock=pmtmr argument from the grub.conf file on all ESX
servers I upgraded.
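
A rough sketch of the grub.conf cleanup described above, for illustration only. It is not the procedure actually used on the ESX servers; the function name `drop_kernel_arg` is hypothetical, and the sketch assumes a GRUB legacy style file in which boot arguments sit on lines beginning with `kernel`.

```python
def drop_kernel_arg(grub_conf: str, arg: str = "clock=pmtmr") -> str:
    """Return grub.conf text with `arg` removed from every kernel line.

    Only lines whose first word is "kernel" are touched. Whitespace between
    the words of a rewritten kernel line is collapsed to single spaces;
    leading indentation and all other lines are preserved as-is.
    """
    out = []
    for line in grub_conf.splitlines():
        stripped = line.lstrip()
        words = stripped.split()
        if words and words[0] == "kernel":
            indent = line[: len(line) - len(stripped)]
            kept = [w for w in words if w != arg]
            line = indent + " ".join(kept)
        out.append(line)
    # Keep the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if grub_conf.endswith("\n") else "")
```

The function works on text rather than on the file directly, so the result can be inspected or diffed before anything is written back.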

I went to a CLS services meeting. I am supposed to work with Jack Neely
to make some VMware features we use be a part of Realm Linux globally in
the near future.

I created a "findutils" rpm for Solaris 10. This was to add the "locate"
command on Solaris 10.

I finished all the new proe packages for Solaris 10.

I tried to contact Steven Stewart in OIT about them taking over our
backups. He still hasn't replied to either of the emails I sent him.

I nagged OIT about the printing issues again this week and they FINALLY
pushed out the update...

Rob's Week

journal entry posted by rfgrau on November 6, 2009 1:46pm

Mediasite
-worked on SAN backups w/ Steven
-worked on ftp setup on new EX server
-investigate SQL hotfix upgrade errors.
-worked on AD LDS backups

Other
-Mediasite Planning Meeting
-Discussed license log moving w/ Michael
-Email discussion w/ Michael

Daniel, 11/2 - 11/6

journal entry posted by dssink on November 6, 2009 11:55am
  • mon/tues
    • worked to restore mail server
  • wed
    • cls meeting and fixing boyette's research lockers
  • thurs
    • 2 educause sessions
    • provisioned a new vhost
    • worked more on boyette's lockers
    • discovered jabber server drives were ro and worked on a script to search out other machines with the same problem
  • fri
    • fixed a problem with the vhost i created
    • turns out that script won't work, cron can't create a temp file to send mail if it finds a ro partition
    • rebooted the jabber server to clear error conditions and applied the new cert
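
A minimal sketch of the read-only partition check mentioned in the Thursday and Friday items, assuming the mount table is read in the Linux `/proc/mounts` format. This is not the script described above; the names `read_only_mounts` and `IGNORED_FSTYPES` are hypothetical, and the sketch only detects read-only mounts. It does not solve the mail delivery problem noted on Friday.

```python
from typing import List, Tuple

# Filesystem types that are virtual or normally read-only, so not of interest.
# This list is an assumption and would need tuning for a real environment.
IGNORED_FSTYPES = {"proc", "sysfs", "tmpfs", "devpts", "iso9660", "squashfs"}


def read_only_mounts(mounts_text: str) -> List[Tuple[str, str]]:
    """Return (device, mount_point) pairs for filesystems mounted read-only.

    Each line is expected to hold: device, mount point, type, options, ...
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        if fstype in IGNORED_FSTYPES:
            continue
        # Match "ro" as a whole option, so "errors=remount-ro" is not a hit.
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found
```

On a Linux host, passing the contents of `/proc/mounts` to `read_only_mounts` and printing the result avoids writing any temporary file on the affected machine.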

Tuesday, November 03, 2009

Richard 10/24 - 10/30

journal entry posted by rsmclane on November 3, 2009 8:24am

Activities:

  • Completed Engineering vCenter maintenance and documentation
  • Worked with Daniel on mail server log issue and resolution
  • Worked with Gary on tape rotation issue that I was responsible for
  • Worked out some logistics in 110 Poe with Design

Meetings:

  • VMware team meeting
    • Discussed vCenter backend DB
  • Met with IES about VMware migration roadmap
    • Should be getting us the first two servers the week of Nov 9th
    • Planning on being off all old servers by Christmas
    • Possibly another 2950 available then
  • Webinar on Groundwork Monitor 6
  • Webinar on VMware Orchestrator
  • View naming standards with the Web Team
    • Damian should be writing it up in the helpdesk wiki
  • webhosting pilot meeting

Monday, November 02, 2009

ddballar: (10/26/2009 - 11/01/2009)

journal entry posted by ddballar on November 2, 2009 11:10am

Started looking at reworking the domain level baseline policies. To this end, I searched for a (preferably free) tool that would diff two group policies, as this would make things a bit easier. Tried one, but it gave errors when used. Also came across Microsoft Advanced Group Policy Management (AGPM), which "helps you better manage Group Policy objects (GPOs) in your environment by providing change control, offline editing, and role-based delegation." Downloaded and read up on this a bit, perhaps for a future discussion.

Page 5 HP4100 printer seems to be failing. Worked on diagnosing the error on the LED display, and research indicates it might be a bad ROM SIMM. Tried to figure out if there was a way to flash the firmware; no luck. Looked at current HP offerings for a replacement printer.

Worked on DC full backup scripts that backup to a network location. Still not working right.

Worked with Billy to get the Nagios monitoring client installed on the WolfTech domain controllers, and configured the OIT Nagios server to monitor them. Took a look at the 1.4 version of the "check_ad.exe" component of the Nagios client package, and asked mpunderw to build me another msi with the updated client in it. Want to test this in the WolfTest domain before trying in WolfTech.

Patched print servers.

Michael: 10/26-10/30

journal entry posted by mpunderw on November 2, 2009 9:36am

- checked on Solidworks and Windows 7 compatibility
- tested all lab software against Windows 7
- worked on trying to get a new Windows 7 compatible AFS client working
- admindesktop migration
- help Sherry in BAE track down a software deployment issue

Friday, October 30, 2009

Daniel, 10/26 - 10/30

journal entry posted by dssink on October 30, 2009 6:13pm
  • Fixed the log problem on engr00mb. Turns out postmaster was over quota and it was bouncing stuff like crazy. Created a quota monitor cron job on the local box to check for any over 95% and email me, Richard, and sysrootmail with a warning.
  • Elockers and server write stuff as usual.
  • Watched an interesting Groundwork Monitor webcast.
  • Talked to Darren from Comtech about logging/splunk and the presentation in RTP in a few weeks. Going to gather some data from him about our daily log size and see if we want to join in on a splunk license.
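
A minimal sketch of the quota check described in the first item above, for illustration only. The source does not say how the cron job on engr00mb gathers usage, so the input here is a hypothetical dictionary of mailbox name to (used, limit); the name `over_quota` is also an assumption. Sending the warning email is left out.

```python
from typing import Dict, List, Tuple


def over_quota(usage: Dict[str, Tuple[int, int]],
               threshold: float = 0.95) -> List[str]:
    """Return one warning line per mailbox whose usage exceeds `threshold`.

    `usage` maps a mailbox name to (used, limit) in the same unit.
    Mailboxes with no limit set (limit <= 0) are skipped.
    """
    warnings = []
    for mailbox, (used, limit) in sorted(usage.items()):
        if limit <= 0:
            continue
        ratio = used / limit
        if ratio > threshold:
            warnings.append(f"{mailbox}: {used}/{limit} ({ratio:.0%})")
    return warnings
```

A cron wrapper would call this with the current usage figures and mail the returned lines only when the list is non-empty.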

Gary - 10/26 to 10/30

journal entry posted by gsgatlin on October 30, 2009 3:10pm

I changed the backup tapes.
I worked on Remedy calls.

I did a lot more testing with the VMware VMXNET 2 driver disks. All ISOs
on the ESX and ESXi servers have been replaced with my custom ISOs with the
proper drivers on them. Everything is ready to create new VMs with VMXNET 2
on Realm Linux.

I changed the "batch-ping" command so that it now pings a group of
machines twice. The first sweep is to "wake up" all the PCs in the list
and the second is the actual ping used in emails / output. This change
was requested by Justin Lancaster and is now live.
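
The two-sweep behavior can be sketched roughly as follows. This is not the actual "batch-ping" command; the function names are hypothetical, and `ping_once` assumes the Linux `ping` flags `-c` (count) and `-W` (timeout in seconds).

```python
import subprocess
from typing import Callable, Dict, Iterable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send a single ICMP echo request with the system ping (Linux flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def batch_ping(hosts: Iterable[str],
               ping: Callable[[str], bool] = ping_once) -> Dict[str, bool]:
    """Ping every host twice and report only the second sweep."""
    hosts = list(hosts)
    # Sweep 1: "wake up" pass; results are deliberately discarded.
    for host in hosts:
        ping(host)
    # Sweep 2: the results that go into emails / output.
    return {host: ping(host) for host in hosts}
```

The `ping` parameter exists only so the sweep logic can be exercised without touching the network.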

I did some research on power saving on Linux desktop systems and I also
consulted with Jack Neely on the problem as it was presented to me. He
is going to give this problem more thought for RHEL 6.

I created the openoffice-nautilus-integration rpm for Solaris 10 to
fix a bug with nautilus not knowing how to start an openoffice icon on
the desktop.

I created the openpkg-perlfix rpm to fix a flaw with the openpkg system
not being able to package perl scripts. This fix is required to
re-package the proe wildfire rpms.

I upgraded the license managers data file on engr14lic and engr15lic at the
request of Robbie Little to add a new feature they need.

I started working on a new set of packages for proe wildfire. I am trying
to get openpkg to work with "subpackages" which would be a more proper way
to package this massively huge app.

I tried to get OIT to push out a fix for the Linux printing problems. It
looks like OIT still has not pushed that rpm out to the labs. I did
determine that the new version of cups2lprng fixes the problems.

Rob's Week

journal entry posted by rfgrau on October 30, 2009 12:03pm

Mediasite
-Updated Firewall for Biltmore live presentation next week
-Discussed recorder for History department w/ DELTA
-Troubleshooting History recorder connection to instance
-Reinstalled Media Server Control Service on engr99wms
-Set up IES for Greensboro publishing test.
-fixed authentication timeout in management portal (now 120 minutes)
-Got to the bottom of particularly annoying problem: Mediasite not recognizing Wolftech group membership of a newly created role.
-ran storage space statistics
-worked on Veritas NetBackup w/ Steven for clustered files

Other
-ITECS Managers Meeting
-COE Computer Committee
-Wolfwise Town Hall

Monday, October 26, 2009

Daniel, 10/18 - 10/24

journal entry posted by dssink on October 26, 2009 10:27am
  • Reinstalled laptop with Win7 Pro, had Billy join it to the domain
  • Worked on futureshock/rhel5 web kit
  • Processed lockers and web write requests
  • Engr-sysadm meeting
  • Assisted Pete with some log server stuff
  • Played around with OIT's splunk setup

Michael: 10/19-10/23

journal entry posted by mpunderw on October 26, 2009 9:29am

- tried to get groupwise installed on Windows 7
- tested whether or not lab programs would run in Windows 7
- worked on Admindesktop migration
- looked at default domain policy for Windows 7

ddballar: (10/12/2009 - 10/25/2009)

journal entry posted by ddballar on October 26, 2009 8:58am

Looked at some suspect GPOs noted by the output of gpotool jaklein ran.

Wrote a batch file that uses several of the GPMC scripts to gather various information about, and then backup the WolfTech domain group policies. Also worked on a script that performs a full system backup of a domain controller to a network share. Still need to finalize the expiring of system backup jobs, though.

Discussed DC backups with Billy; sent out some notes to the WolfTech domadmin list.

Worked with Billy to install and configure the Nagios client (packaged by jaklein) onto the WolfTech domain controllers. Went thru how to setup client on the OIT server monitoring webpages.

Worked on a LogParser script that will record reboots and shutdowns of key WolfTech central servers.

Worked on a LogParser script to gather the names of those computers "in" the domain with broken trusts.

Tweaked the WolfTech DC Health Report script.

Saturday, October 24, 2009

Billy: October 18-23

journal entry posted by wrbeaudo on October 24, 2009 12:46am

I'm gone for the next 3 weeks in a foreign land full of wonderment and shenanigans. Sucks to be everyone else.

  • TMOS 10 - Got a gpo setup, it will update v8, but it's talking to the wrong parent server.
  • ENGR Cleanup - Deleted krb_disable user accounts and removed a couple more GPO/OU/groups.
  • Project Planning - Went around to everyone in Systems and a couple people on the second floor to discuss what would be going on for the next couple weeks to help them prioritize and to give feedback on anything that would normally require my assistance/input/etc.
  • Networking - Troubleshooting Celerra access issues, going through QIP DB dump to try and distinguish VLANs from subnets, monitoring the DC's via OIT's nagios on Sysnews.
  • Podcasts - Got some MS, FLOSS, VMware, Security podcasts to fill up the 60+ hours of travel time I'll have to/from New Zealand.

Meetings:

  • IAM Roadmap
  • Solar Center -> Wolftech
  • Engr-Sysadmin
  • Domain Controller Monitoring
  • AD Policy Committee
  • Project Planning x 6
  • Network Architect Interviews x 3

Friday, October 23, 2009

Gary - 10/11 to 10/23

journal entry posted by gsgatlin on October 23, 2009 3:49pm

I worked on remedy calls.
I changed the backup tapes.
I went to the Engineering sysadmin meeting.

I pushed out a new openoffice.org to the labs. The default version users
see is now dependent on which openoffice-local rpm is installed rather
than which version in AFS is the default. The app does better app logging on
Linux now. Most of the time it will show oocalc or oowriter instead of
the generic ooffice as the app that was launched by the user.

I created a new rpm called openoffice-local-bin which does a lot of the
work. Both rpm packages will work on a stock RHEL 5 or CentOS 5 box. Thus,
I have made the repository world readable. I also created documentation
on this on the techies wiki here.

The packages I built for openoffice.org 3.1.1 are much better than earlier
versions of the rpms I had created. They do all the work in the %build
section rather than the %post section.

The better openoffice-local package now will install on Solaris 10
at install time rather than needing to be added post install. I'm pretty
sure this is because the packages are more properly built now.

I started working on better versions of the rpms for proe wildfire so
that they will also be able to install at install time on Solaris 10.
The rpms are working but they have exposed some other bugs in Solaris 10.
So this isn't ready to share with the Solaris 10 list yet. I will need
to spend more time on this problem next week.

I finished all the ESX VMXNET 2 RHEL 5 Realm Linux boot ISOs and I moved
them onto each ESX box. I created a method to build the VMware tools
modules post install since we don't have AFS on first boot.

I still need to do a bit more testing but this is basically finished now.

I updated this article which describes different issues that can come up
running Realm Linux within VMware. The article had become out of date.

I modified the include "ESX-vmconfig" within RHEL 5 so that it no longer
adds "clock=pmtmr" to grub.conf since it's no longer required. I also
changed this file for RHEL 4 to add the "divider=10" parameter in
case we ever do any more RHEL 4 installs.

I created a spreadsheet with everything I think OIT needs to begin taking
over our backups. I am waiting to hear back from Steven Stuart in OIT to
see what the next steps we need to take are.
