Just about every week, I put together the Windows Client
Developer Roundup. This consists of a bunch of links to posts that
deal with topics of interest to client developers. One of the more
time-consuming tasks in that work is simply copying and pasting the
link information into LiveWriter. When I open the URL in IE, I have
to manually copy the link and then either copy and paste-special
the title (which is annoying, error prone, and includes a dialog
with a setting I have to choose each time) or, because I type fast,
just retype the post title.
At the same time, I have to clean all the FeedBurner gunk off
the end of the URL, because links clicked in the roundup shouldn't
count towards FeedBurner traffic (that's a debate for another
time). An example FeedBurner URL with tracking looks like this:
/blog/2011/02/17/asynchronous-web-and-network-calls-on-the-client-in-wpf-and-silverlight-and-net-in-general?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+PeteBrown+%28Pete+Brown%27s+Blog%29
It doesn't seem like a lot of work, but go ahead and click on
that link. Now, in the address bar, click and select just the
portion of the URL up to the question mark. Notice all the extra
clicking you have to do (or click, then home then move the cursor
etc.)? Do that 25 times and you see how it gets tedious.
I've been wanting to write an IE add-in to copy and clean this
for me, but didn't get around to it until today. It took me almost
all day to write this add-in and this blog post, so blame them for
the Windows Client Developer Roundup being a day late this week
:)
Why did it take so long? I wanted to do this in C++, and my C++
is slightly more rusty than an old tractor left in the back field
for most of the century.
Now, you can actually create a basic add-in simply
by storing a little hunk of script on the drive and adding the
right registry keys. However, I specifically wanted to do this in
C++. No, not because I hate myself, but because I'm starting to see
a resurgence of interest in C++. You can create add-ins using .NET
and Script, but both have significant limitations as well as
performance concerns. If you want to write an add-in of any
complexity, you'll almost certainly want to write it in C++. So,
that's what I decided to do.
These two articles were absolutely required reading to figure
out how to do this add-in.
Now, on to the project. I used Visual C++ inside Visual Studio
2010 for this project. I also used Internet Explorer 8 on Windows
7.
Project Setup
Visual C++ projects are typically created using wizards. Create
a new ATL Project using the ATL Project wizard.
Make sure you check off "Allow merging of proxy/stub code" or
you'll get compile errors. Also ensure you're creating a DLL.
Create the Add-in Class
Next, we'll need to create a class in our project which will be
used to expose the functionality of the toolbar button. For that,
we'll use the ATL Simple Object wizard.
I called the class BlogUrlSnaggerAddIn. Everything else, except
the ProgID, was filled in for me. Leave that blank.
Leave the next page with the default (disabled) values. The
final page lets you set some of the options. Be sure to select
IObjectWithSite so you can work with IE. I also changed
"Aggregation" to "No" like the MSDN example.
Now, without changing anything, build the project. What? You got
a bunch of errors because the component couldn't be registered?
Ahh, you'll need to exit Visual Studio then run it as Administrator
or enable per-user redirection. To enable per-user redirection,
right-click the main project and select properties. Then view the
Linker General property page. The setting is there; change it to
"Yes". (don't do this now, see note below)
Unfortunately, the registration we'll be using won't support
that, so while it will get you past the initial compile,
the later rgs additions will require that you run in
administrator mode.
So, once you exit Visual Studio and re-start in Administrator
mode, you'll be able to rebuild and have a base project with no
errors.
Adding in IOleCommandTarget Declarations
As the MSDN document instructs us, we'll need to add in
references to a couple header files (shlguid and mshtml) as well as
add support for the IOleCommandTarget interface. I love that I get
intellisense on the include files.
// BlogUrlSnaggerAddIn.h : Declaration of the CBlogUrlSnaggerAddIn
#pragma once
#include "resource.h" // main symbols
#include <ShlGuid.h>
#include <MsHTML.h>
...
// CBlogUrlSnaggerAddIn
class ATL_NO_VTABLE CBlogUrlSnaggerAddIn :
public CComObjectRootEx<CComSingleThreadModel>,
public CComCoClass<CBlogUrlSnaggerAddIn, &CLSID_BlogUrlSnaggerAddIn>,
public IObjectWithSiteImpl<CBlogUrlSnaggerAddIn>,
public IDispatchImpl<IBlogUrlSnaggerAddIn, &IID_IBlogUrlSnaggerAddIn, &LIBID_BlogUrlSnaggerLib, /*wMajor =*/ 1, /*wMinor =*/ 0>,
public IOleCommandTarget
{
public:
CBlogUrlSnaggerAddIn()
{
}
DECLARE_REGISTRY_RESOURCEID(IDR_BLOGURLSNAGGERADDIN)
DECLARE_NOT_AGGREGATABLE(CBlogUrlSnaggerAddIn)
BEGIN_COM_MAP(CBlogUrlSnaggerAddIn)
COM_INTERFACE_ENTRY(IBlogUrlSnaggerAddIn)
COM_INTERFACE_ENTRY(IDispatch)
COM_INTERFACE_ENTRY(IObjectWithSite)
COM_INTERFACE_ENTRY(IOleCommandTarget)
END_COM_MAP()
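The new lines in the BlogUrlSnaggerAddIn.h file snippet above are
the two #include lines for ShlGuid.h and MsHTML.h, the
IOleCommandTarget base class, and the
COM_INTERFACE_ENTRY(IOleCommandTarget) entry.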
IObjectWithSite and IOleCommandTarget
Right after the "END_COM_MAP()" statement in the .h file, add in
the following code to define the interface methods for
IObjectWithSite and IOleCommandTarget (I copied this right out of
the MSDN article)
public:
// IObjectWithSite
STDMETHOD(SetSite)(IUnknown *pUnkSite);
// IOleCommandTarget
STDMETHOD(Exec)(const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut);
STDMETHOD(QueryStatus)(const GUID *pguidCmdGroup, ULONG cCmds,
OLECMD *prgCmds, OLECMDTEXT *pCmdText);
private:
void GenerateReport(BSTR filename);
BSTR GenerateReportStylesheet();
BSTR DoImageReport(IHTMLDocument2* pDocument);
private:
CComPtr<IWebBrowser2> m_spWebBrowser;
CComQIPtr<IOleCommandTarget, &IID_IOleCommandTarget> m_spTarget;
public:
If you leave out the final "public", this will mess up the scope
used in the DECLARE_PROTECT_FINAL_CONSTRUCT macro, so make
sure you add the "public" at the end of the code
above.
Next, we'll actually implement the code for the SetSite method
we added to the .h file.
Storing the Browser Reference using SetSite
The next hunk of code we need to add is to the
BlogUrlSnaggerAddIn.cpp file. It implements the SetSite method to
grab an instance of the browser for us to use in the other methods.
This is primarily copied from the MSDN example, with the class name
changed.
STDMETHODIMP CBlogUrlSnaggerAddIn::SetSite(IUnknown *pUnkSite)
{
if (pUnkSite != NULL)
{
// Cache the pointer to IWebBrowser2
CComQIPtr<IServiceProvider> sp = pUnkSite;
// Guard against a site that doesn't expose IServiceProvider
if (sp)
{
HRESULT hr = sp->QueryService(IID_IWebBrowserApp,
IID_IWebBrowser2, (void**)&m_spWebBrowser);
hr = sp->QueryInterface(IID_IOleCommandTarget,
(void**)&m_spTarget);
}
}
else
{
// Release pointer
m_spWebBrowser.Release();
m_spTarget.Release();
}
// Return base implementation
return IObjectWithSiteImpl<CBlogUrlSnaggerAddIn>::SetSite(pUnkSite);
}
Next, we need to implement the QueryStatus method. We'll also
stub out the Exec method for us to fill in later. These go in the
same .cpp file.
STDMETHODIMP CBlogUrlSnaggerAddIn::Exec(
const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
{
// This is the method where all the action happens
return S_OK;
}
STDMETHODIMP CBlogUrlSnaggerAddIn::QueryStatus(
const GUID* pguidCmdGroup, ULONG cCmds,
OLECMD prgCmds[], OLECMDTEXT* pCmdText)
{
int i;
// Says we can do anything
// Shamelessly snagged from Eli's blog post
for (i=0; i<((int) cCmds); i++)
prgCmds[i].cmdf = OLECMDF_SUPPORTED | OLECMDF_ENABLED;
return S_OK;
}
Note that just like Eli did in his example, I went ahead and
indicated that we support all commands. You may want to be more
selective in your own implementation.
Anyway, with that in place, the next step is to add in a little
bit of registration.
Adding the Registration Code
Now, you need to dig up the CLSID for your class. That is stored
in this case in the generated "BlogUrlSnagger_i.h" file with the
line that looks like this:
#ifdef __cplusplus
class DECLSPEC_UUID("4E17A214-3A1E-44CE-ACA5-09965A675359")
BlogUrlSnaggerAddIn;
#endif
#endif /* __BlogUrlSnaggerLib_LIBRARY_DEFINED__ */
There's another GUID for the IDispatch interface which is also
useful. However, both are pre-populated for us in the .rgs file
we'll need to edit. (I just wanted you to see one place where they
are defined. They're also duplicated in the .IDL file, and coded
in hex form in the _i.c file. Ugh.)
Crack open the BlogUrlSnaggerAddIn.rgs file in the ResourceFiles
folder. You'll see it has been pre-populated with the class name
and the main CLSID and the TypeLib CLSID. Excellent. Now we need to
modify that to be appropriate to our class. Don't simply
cut and paste my code below, these GUIDs are Mine MINE! Get your
own stinkin' GUIDs!
HKLM
{
NoRemove SOFTWARE
{
NoRemove Microsoft
{
NoRemove 'Internet Explorer'
{
NoRemove Extensions
{
ForceRemove {4E17A214-3A1E-44CE-ACA5-09965A675359} = s 'BlogUrlSnaggerAddIn Class'
{
val 'Default Visible' = s 'yes'
val 'ButtonText' = s 'Copy Clean URL'
val 'CLSID' = s '{1FBA04EE-3024-11d2-8F1F-0000F87ABD16}'
val 'ClsidExtension' = s '{4E17A214-3A1E-44CE-ACA5-09965A675359}'
val 'Icon' = s 'C:\Program Files\MyApp\foo.ico'
val 'HotIcon' = s 'C:\Program Files\MyApp\foo_hot.ico'
}
}
}
}
}
}
This block is in addition to the registration block already in
the rgs file; don't replace the old one with this one, simply add
this one below it. Also keep in mind that all those NoRemove
statements are pretty darn important. You don't want to hose your
registry when playing with this little project.
The settings are:
Setting | Description |
Default Visible | Set to Yes if you want this to be visible by default |
ButtonText | The text you want to have appear for your button |
CLSID | {1FBA04EE-3024-11D2-8F1F-0000F87ABD16} (see notes in the MSDN article) |
ClsidExtension | The CLSID from your add-in class |
Icon | Full path to the icon file. Can also be a path and resource ID. |
HotIcon | Path to the icon file for the hover/selected icon. |
There must be some way to resolve that Icon and HotIcon to the
runtime installation folder. I'm not sure what it is, though.
If you know, please tell me in the comments
below.
Finally, go into the .rc file and change the CompanyName,
FileDescription, LegalCopyright, and ProductName to something that
makes sense.
Phew! That's it for the setup.
Adding in Test Functionality
Much like Eli did in his post, I'm going to do a simple "Hello
World" test before I try to implement the real functionality. In
classic first-timer fashion, this is going to simply display a
message box that says "Hello World!". Crack open that add-in .cpp
file again and modify the Exec method so it includes a call to
MessageBox as shown here
STDMETHODIMP CBlogUrlSnaggerAddIn::Exec(
const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
{
// This is the method where all the action happens
MessageBox(NULL, _T("Hello, World!"), _T("Greetings"), 0);
return S_OK;
}
Now time to test! Do a build and then open up Internet Explorer.
You'll see your add-in in the toolbar. If your icon path is invalid
(like mine is) you'll see a little gear icon, at least in IE8.
(A message box! Dig the retro button on that message box. That's
another post for another day)
Adding in the Actual Functionality
Now, we get to the actual functionality. Just about everything
up to this point was just ceremony. Now I need to actually learn
how to interface with the browser page.
The interface we'll start with is IWebBrowser2.
You can find the specs for this interface in MSDN here. The two
functions I'm interested in are get_LocationURL and
get_LocationName. I see both take a pointer to a BSTR they fill
with the result.
Remember, I'm pretty new to C++. I know what a BSTR is, but I
realized I had no idea how to allocate and free one. Luckily, we
have some help there too.
Another Test
Once I knew what to do to release the string, I expanded the
test code to grab the location URL and location Name, and display
it in a MessageBox.
STDMETHODIMP CBlogUrlSnaggerAddIn::Exec(
const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
{
// Start at NULL so SysFreeString is safe even if a call below fails
BSTR locationUrl = NULL;
BSTR locationName = NULL;
m_spWebBrowser->get_LocationURL(&locationUrl);
m_spWebBrowser->get_LocationName(&locationName);
MessageBox(NULL, locationUrl, locationName, 0);
::SysFreeString(locationUrl);
::SysFreeString(locationName);
return S_OK;
}
When run, the application now displays the following
MessageBox
Excellent! That's exactly what I was looking for. The next step
is to copy it to the clipboard.
Copying to the Clipboard
I want to copy both the page URL and the page title to the
clipboard. The natural choice of format here is HTML. It took me a
bit of piecing together from various examples I found while
searching. I also happened upon a better way to handle BSTR
instances without worrying about managing their allocation:
CComBSTR. Finally, I figured out what string type to use for regular
string manipulation: ATL::CString. Using that requires adding a
#include <atlstr.h> to the top of your BlogUrlSnaggerAddIn.h file.
STDMETHODIMP CBlogUrlSnaggerAddIn::Exec(
const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
{
// This is the method where all the action happens
//MessageBox(NULL, _T("Hello, World!"), _T("Greetings"), 0);
CComBSTR locationUrl;
CComBSTR locationName;
m_spWebBrowser->get_LocationURL(&locationUrl);
m_spWebBrowser->get_LocationName(&locationName);
//MessageBox(NULL, locationUrl, locationName, 0);
// register the HTML Format clipboard data format
static int CF_HTML = 0;
if(!CF_HTML) CF_HTML = RegisterClipboardFormat(_T("HTML Format"));
// Build the HTML String. Yeah, I can probably use the Format* methods and
// this approach is likely very inefficient. My string processing here
// could use some touch-up anyway :)
CString html =
CString("Version:0.9\r\n") +
CString("StartHTML:00000000\r\n") +
CString("EndHTML:00000000\r\n") +
CString("StartFragment:00000000\r\n") +
CString("EndFragment:00000000\r\n") +
CString("<!DOCTYPE>\r\n") +
CString("<HTML>\r\n") +
CString("<BODY>\r\n") +
CString("<!--StartFragment-->\r\n") +
CString("<a href=\"") + locationUrl + CString("\">") + locationName + CString("</a>") +
CString("<!--EndFragment-->\r\n") +
CString("</BODY>\r\n") +
CString("</HTML>");
// I'm making the assumption here that 1 byte == 1 char as we're
// going to work with UTF-8
CString startHtml, endHtml, startFragment, endFragment;
startHtml.Format(_T("StartHTML:%08u"), html.Find(_T("<HTML>")));
endHtml.Format(_T("EndHTML:%08u"), html.GetLength() -1);
startFragment.Format(_T("StartFragment:%08u"), html.Find(_T("<!--StartFragment-->")));
endFragment.Format(_T("EndFragment:%08u"), html.Find(_T("<!--EndFragment-->")));
html.Replace(_T("StartHTML:00000000"), startHtml);
html.Replace(_T("EndHTML:00000000"), endHtml);
html.Replace(_T("StartFragment:00000000"), startFragment);
html.Replace(_T("EndFragment:00000000"), endFragment);
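// Convert to UTF-8 first so the buffer is sized in bytes, not characters
CT2A utf8Html(html, CP_UTF8);
const char *utf8 = utf8Html;
size_t byteCount = strlen(utf8) + 1; // include the terminating null
// Allocate global memory for transfer
HGLOBAL hClipboardData = GlobalAlloc(GMEM_MOVEABLE |GMEM_DDESHARE, byteCount);
if (hClipboardData == NULL) return E_OUTOFMEMORY;
// Put your string in the global memory...
char *ptr = (char *)GlobalLock(hClipboardData);
memcpy(ptr, utf8, byteCount);
GlobalUnlock(hClipboardData);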
// copy to the clipboard
::OpenClipboard(0); //browserHwnd
::EmptyClipboard();
::SetClipboardData(CF_HTML, hClipboardData);
::CloseClipboard();
//MessageBox(NULL, html, _T("Copied to Clipboard"), 0);
return S_OK;
}
That's a lot of code. However, because I'm inefficient with C++,
it should be reasonably easy to follow :) . Here's what's going
on:
- First, I get the URL and Title from IE
- Next, I register the HTML Clipboard format using the known
"HTML Format" string
- Then, using some ugly string replace code, I create the
properly formatted clipboard data. Note that HTML written to the
clipboard needs to follow a format that includes context: typically
a valid HTML doc, even when you're just pasting a little link.
- Once I have all that ugly string replace done, I put the string
into global memory
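- Once in global memory, I open the clipboard with an hWnd of 0,
which means no window owns it. The OpenClipboard documentation warns
that EmptyClipboard then sets the clipboard owner to NULL, which can
make SetClipboardData fail, so passing the browser's window handle
is the safer choice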
- I then empty the current contents (how rude!), add the new
data, close the clipboard and return Success.
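The offsets in the "HTML Format" header are byte counts from the start of the clipboard data, and StartFragment/EndFragment are meant to point just inside the two fragment comments. Here is a minimal sketch of computing them directly on the UTF-8 bytes, in plain standard C++ rather than ATL. The name BuildCfHtml is made up for illustration, and the sketch assumes the fragment is already UTF-8 encoded.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: wraps a UTF-8 HTML fragment in the "HTML Format"
// clipboard envelope and fills in the four byte offsets.
std::string BuildCfHtml(const std::string& fragmentUtf8)
{
    // Fixed-width %08u fields keep the header length constant.
    static const char headerFormat[] =
        "Version:0.9\r\n"
        "StartHTML:%08u\r\n"
        "EndHTML:%08u\r\n"
        "StartFragment:%08u\r\n"
        "EndFragment:%08u\r\n";
    const std::string prefix = "<HTML>\r\n<BODY>\r\n<!--StartFragment-->";
    const std::string suffix = "<!--EndFragment-->\r\n</BODY>\r\n</HTML>";

    // Format once with zeros just to measure the header length in bytes.
    char header[128];
    const int headerLength =
        std::snprintf(header, sizeof(header), headerFormat, 0u, 0u, 0u, 0u);
    assert(headerLength > 0 && headerLength < static_cast<int>(sizeof(header)));

    // Every offset is counted in bytes from the first byte of the header.
    const unsigned startHtml = static_cast<unsigned>(headerLength);
    const unsigned startFragment = startHtml + static_cast<unsigned>(prefix.size());
    const unsigned endFragment = startFragment + static_cast<unsigned>(fragmentUtf8.size());
    const unsigned endHtml = endFragment + static_cast<unsigned>(suffix.size());

    // Format again with the real values; the length is unchanged.
    std::snprintf(header, sizeof(header), headerFormat,
                  startHtml, endHtml, startFragment, endFragment);
    return std::string(header) + prefix + fragmentUtf8 + suffix;
}
```

Because the arithmetic is done on the encoded bytes, a title with accented characters no longer throws the offsets off. The returned string (plus a terminating null) is what would be copied into the global memory block.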
Now if you run it, you should be able to click the toolbar
button and get the title and URL pasted into LiveWriter (or Word or
whatever).
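One thing the link-building code does not handle is a title or URL containing characters such as & or <, which would produce malformed HTML on the clipboard. A minimal sketch of escaping both values before building the anchor, again in plain standard C++; EscapeHtml and BuildAnchor are hypothetical names, not part of the add-in above.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that are special in HTML
// text and in double-quoted attribute values.
std::string EscapeHtml(const std::string& text)
{
    std::string escaped;
    escaped.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&': escaped += "&amp;";  break;
        case '<': escaped += "&lt;";   break;
        case '>': escaped += "&gt;";   break;
        case '"': escaped += "&quot;"; break;
        default:  escaped += text[i];  break;
        }
    }
    return escaped;
}

// Hypothetical helper: builds the anchor placed between the fragment markers.
std::string BuildAnchor(const std::string& url, const std::string& title)
{
    return "<a href=\"" + EscapeHtml(url) + "\">" + EscapeHtml(title) + "</a>";
}
```

For a page titled Q&A at http://example.com/?a=1&b=2, this yields an anchor whose href and text both carry &amp; instead of a bare ampersand.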
Cleaning up the URL
Ok, now that copy & paste works, it's time to do the
cleaning. Here's a reminder of what FeedBurner URLs with tracking
look like:
/blog/2011/02/17/asynchronous-web-and-network-calls-on-the-client-in-wpf-and-silverlight-and-net-in-general?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+PeteBrown+%28Pete+Brown%27s+Blog%29
What I need to do is get rid of anything following utm_source up
to the next ampersand or end of string. Then the same with
utm_medium and utm_campaign. I'd just as soon get rid of everything
after the ? but that will break a few blogs that don't yet use
SEO-friendly URLs. I also don't want to be dependent on the
specific order of those parameters.
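To make the requirement concrete, here is a minimal sketch of the same idea in plain standard C++, independent of CString: split the query string on ampersands, drop any pair whose name is on the unwanted list, and rejoin whatever is left. StripQueryParameters is a hypothetical name used only for illustration, and the sketch ignores #fragment suffixes.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns url without the query parameters whose
// names appear in `unwanted`, regardless of the order they occur in.
std::string StripQueryParameters(const std::string& url,
                                 const std::vector<std::string>& unwanted)
{
    const std::string::size_type queryStart = url.find('?');
    if (queryStart == std::string::npos)
        return url; // no query string, nothing to clean

    std::string cleaned = url.substr(0, queryStart);
    const std::string query = url.substr(queryStart + 1);
    char separator = '?'; // becomes '&' once the first pair is kept

    std::string::size_type pos = 0;
    while (pos <= query.size())
    {
        std::string::size_type end = query.find('&', pos);
        if (end == std::string::npos)
            end = query.size();

        const std::string pair = query.substr(pos, end - pos);
        const std::string name = pair.substr(0, pair.find('='));

        // Drop empty pairs and any pair whose name is unwanted.
        bool drop = pair.empty();
        for (std::vector<std::string>::size_type i = 0; i < unwanted.size(); ++i)
        {
            if (name == unwanted[i])
                drop = true;
        }

        if (!drop)
        {
            cleaned += separator;
            cleaned += pair;
            separator = '&';
        }
        pos = end + 1;
    }
    return cleaned;
}
```

Called with utm_source, utm_medium and utm_campaign as the unwanted names, a URL such as /post.aspx?id=42&utm_source=feedburner comes back as /post.aspx?id=42, and a URL whose only parameters are the tracking ones loses its question mark as well.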
To help, I added an additional private function to the class. In
the .h, I added:
CString RemoveParameter(CString url, CString parameter);
Then in the class itself, I modified the Exec function and added
the implementation of the new function.
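CString CBlogUrlSnaggerAddIn::RemoveParameter(CString url, CString parameter)
{
// Only look for the parameter inside the query string
int paramListStart = url.Find(_T('?'));
if (paramListStart < 0)
return url;
int paramStart = url.Find(parameter + _T("="), paramListStart);
// Leave the URL untouched if the parameter isn't present
if (paramStart < 0)
return url;
int paramEnd = url.Find(_T('&'), paramStart);
// No ampersand after it means this is the last parameter
if (paramEnd < 0) paramEnd = url.GetLength() - 1;
CString result = url.Left(paramStart) + url.Mid(paramEnd + 1);
// Removing the last parameter can leave a dangling ampersand
result.TrimRight(_T('&'));
return result;
}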
STDMETHODIMP CBlogUrlSnaggerAddIn::Exec(
const GUID *pguidCmdGroup, DWORD nCmdID,
DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
{
// This is the method where all the action happens
//MessageBox(NULL, _T("Hello, World!"), _T("Greetings"), 0);
CComBSTR locationUrl;
CComBSTR locationName;
m_spWebBrowser->get_LocationURL(&locationUrl);
m_spWebBrowser->get_LocationName(&locationName);
//MessageBox(NULL, locationUrl, locationName, 0);
// register the HTML Format clipboard data format
static int CF_HTML = 0;
if(!CF_HTML) CF_HTML = RegisterClipboardFormat(_T("HTML Format"));
// remove the FeedBurner parameters from the URL
CString cleanedUrl = locationUrl;
cleanedUrl = RemoveParameter(cleanedUrl, _T("utm_source"));
cleanedUrl = RemoveParameter(cleanedUrl, _T("utm_medium"));
cleanedUrl = RemoveParameter(cleanedUrl, _T("utm_campaign"));
// check for a final ? (meaning no other parameters)
if (cleanedUrl.Right(1) == _T("?"))
cleanedUrl = cleanedUrl.Left(cleanedUrl.GetLength()-1);
CString fragment = CString("<a href=\"") + cleanedUrl + CString("\">") +
locationName + CString("</a>");
// Build the HTML String. Yeah, I can probably use the Format* methods and
// this approach is likely very inefficient. My string processing here
// could use some touch-up anyway :)
CString html =
CString("Version:0.9\r\n") +
CString("StartHTML:00000000\r\n") +
CString("EndHTML:00000000\r\n") +
CString("StartFragment:00000000\r\n") +
CString("EndFragment:00000000\r\n") +
CString("<!DOCTYPE>\r\n") +
CString("<HTML>\r\n") +
CString("<BODY>\r\n") +
CString("<!--StartFragment-->\r\n") +
fragment + CString("\r\n") +
CString("<!--EndFragment-->\r\n") +
CString("</BODY>\r\n") +
CString("</HTML>");
// I'm making the assumption here that 1 byte == 1 char as we're
// going to work with UTF-8
CString startHtml, endHtml, startFragment, endFragment;
startHtml.Format(_T("StartHTML:%08u"), html.Find(_T("<HTML>")));
endHtml.Format(_T("EndHTML:%08u"), html.GetLength() -1);
startFragment.Format(_T("StartFragment:%08u"), html.Find(_T("<!--StartFragment-->")));
endFragment.Format(_T("EndFragment:%08u"), html.Find(_T("<!--EndFragment-->")));
html.Replace(_T("StartHTML:00000000"), startHtml);
html.Replace(_T("EndHTML:00000000"), endHtml);
html.Replace(_T("StartFragment:00000000"), startFragment);
html.Replace(_T("EndFragment:00000000"), endFragment);
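// Convert to UTF-8 first so the buffer is sized in bytes, not characters
CT2A utf8Html(html, CP_UTF8);
const char *utf8 = utf8Html;
size_t byteCount = strlen(utf8) + 1; // include the terminating null
// Allocate global memory for transfer
HGLOBAL hClipboardData = GlobalAlloc(GMEM_MOVEABLE |GMEM_DDESHARE, byteCount);
if (hClipboardData == NULL) return E_OUTOFMEMORY;
// Put your string in the global memory...
char *ptr = (char *)GlobalLock(hClipboardData);
memcpy(ptr, utf8, byteCount);
GlobalUnlock(hClipboardData);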
// copy to the clipboard
::OpenClipboard(0); //browserHwnd
::EmptyClipboard();
::SetClipboardData(CF_HTML, hClipboardData);
::CloseClipboard();
//MessageBox(NULL, html, _T("Copied to Clipboard"), 0);
return S_OK;
}
That's it! It's all working. I tested it with additional
parameters and validated that it didn't screw them up. Of course, I
may still have to do some slight cleaning of URLs for cases when
people (like me) put the site name in the title, but otherwise
this should save me a good bit of work every week. Plus, I learned
a little C++ along the way. Win!
Keep in mind, I'm relearning C++. If I did something
dumb, don't hesitate to (nicely) point it out in the comments,
especially if it's something other people shouldn't repeat in their
own code.
(note: this was all tested using 32bit IE8 on Win7 x64: Works On
My Machine)
*** UPDATE: New source code and an update in
this blog post. ***