Comments Locked

38 Comments

  • FrankyJunior - Sunday, April 30, 2006 - link

    For anyone that wants to try Dragon, I just noticed that the preferred version is in the CompUSA ad today for $99.

    Never would have looked twice at it if I hadn't read this article yesterday.
  • NullSubroutine - Thursday, April 27, 2006 - link

    are we to the day when i say 'computer' and it does what i want, and when i time travel by going around the sun ill be confused when they hand me a mouse and keyboard when wanting to use a computer?
  • JarredWalton - Thursday, April 27, 2006 - link

    Almost. And if you go around the sun *backwards* you can travel through time in the other direction. :D
  • quanta - Tuesday, April 25, 2006 - link

    How about a review based on VoiceBox Technologies (http://www.voicebox.com) products? It was demonstrated on Discovery Channel, and it seems to work without extensive voice training, and it actually _understands_ human speech. The Discovery Channel segment can be found here: http://www.exn.ca/dailyplanet/view.asp?date=3/13/2...
  • rico - Tuesday, April 25, 2006 - link

    Where did you find Dragon Pro for $160? I thought it usually cost about $800. Thanks.
  • JarredWalton - Tuesday, April 25, 2006 - link

    Heh, sorry - got "Preferred" and "Professional" mixed up. I'm not entirely sure what Pro includes, i.e. "Comes with a full set of network deployment tools."

    Trying to surf through Nuance's site is a bit tricky, and finding prices takes some effort as well. I think the only difference between Standard and Preferred is the ability to transcribe recordings in Preferred - can anyone confirm for sure? I asked Nuance and didn't get a reply.
  • Tabah - Sunday, April 23, 2006 - link

    Excellent article/review. Here's the question I've been wondering. Personally I use DNS for blogging and generally anything that requires excessive typing. A friend of mine on the other hand swears by IBM ViaVoice. Any chance we could get a comparison article/review at a later date?
  • JarredWalton - Tuesday, April 25, 2006 - link

    I will try to get in touch with IBM. I'm sure they wouldn't mind participating in a follow-up article.
  • Tabah - Tuesday, April 25, 2006 - link

    Oddly enough, ViaVoice is licensed by Nuance, so you might have a better chance talking to them. The main reason I'd like to see a comparison between VV and DNS isn't so much because they're made/released by the same company, but because of the cost difference between them. Like I said before, I really like DNS, but VV at the high end (VV Pro USB vs DNS Pro) is still a few hundred dollars cheaper.
  • Poser - Sunday, April 23, 2006 - link

    Listening to the dictation files, I was amazed that all the punctuation was spoken. I would have expected that they would (or could) be replaced by using a non-speech sound. Something along the lines of a click of the tongue for a comma -- there's a good number of distinct sounds you can make with your tongue that we don't have words for but that anyone could recognize and make. Think of "The Gods Must be Crazy" and the language used by the Kalahari bushmen for an extreme example.

    Also, thanks for the article, it was really interesting and potentially very helpful! I'll hold off until Vista hits and I see some comparisons, but I'm certain now that I'll end up using one of the two.
  • JarredWalton - Sunday, April 23, 2006 - link

    Isn't there some comedy routine by an older gentleman that does the whole "verbalize punctuation" shtick? One of the things I might look at in the follow-up article is showing how Dragon does when turning on automatic punctuation. It will attempt to insert periods, commas, and question marks (at least, I think it does question marks) depending on how you speak the text. Obviously, that means you have to be a lot more careful when reading/dictating.

    I found it more useful to manually dictate my punctuation, since on frequent occasions I will pause midsentence to try and think what I want to say -- or because of some interruption. Basically, as a writer, punctuation is something that I take pretty seriously. DNS does pretty well with getting it right, but it also makes plenty of mistakes.
  • Admiral Ackbar - Monday, April 24, 2006 - link

    Victor Borge. It's called phonetic punctuation. It was one of the funniest things I have ever seen (I had the privilege of seeing him not long before he died).

    Actually though, it could work, and it's quicker than actually saying the word period or question mark.
  • JarredWalton - Tuesday, April 25, 2006 - link

    I bet it takes a hell of a lot of practice, too! Especially if you want to speak at a reasonable clip. I remember laughing my butt off at Victor Borge's routine quite a few years ago. On the bright side, more people might learn how to use proper punctuation!

    You also have to worry about the speech recognition software starting to recognize random noises (like a cough) as actual dictation. That happens already, but usually Dragon is smart enough to realize that my cough was merely a loud noise. Sometimes I get the random "the" from it, though.
  • Tujan - Saturday, April 22, 2006 - link

    I would be interested in knowing exactly what the program does. Something more focused on its features, interaction, etc., rather than a loose comparison between two programs - a sort of benchmark.

    For example - you mention command mode, but don't go any further into what that encompasses. That alone has its limitations, I'm sure. Yet I am also sure that many might want to know exactly what it is about. For example: Start - My Documents - FolderName - Open... and so on. Is this how it works? Or something like the HTPC scenario in which you query your favorite TV show - "Channel - channel name", or "program name - file name - open" - for the HTPC.

    Everybody should know what a vacuum cleaner can do for you. Ya know. But what can you do for your vacuum cleaner?

    I imagine (note: imagine, yes), given speech recognition that was well enough along, you could utilize a command line interface, and programmers would be able to program more quickly and easily. Other than having your vacuum cleaner attack you, ya know, you could do something like 'Dir' - listing of directories. Or 'MD' - make directories.

    I don't know any programming code, so I can't give examples other than the DOS command line. Still, you can see what I'm getting at. Program your HTML, for example.

    But within the Windows environment, you could ask how well the program takes commands and multitasks, since you could use the wave file to do this. And so on.

    I'm just curious. I don't see a lot of interesting software reviews dealing with the nuts and bolts of the application itself lately.

    Try a ram drive with that - take the chains off maybe ?
  • Ardemus - Friday, April 21, 2006 - link

    1) How was the software trained? Were you using "normal" or "dictation" speech patterns?

    2) Dragon may do much better with a wav over a real time system because it can read ahead and analyze the whole file.

    3) Does Dragon give up resources when other applications ask for them?

    4) What sort of errors were made? How many errors are there after a spelling and grammar check in MS Word?

    5) Can you correct the errors in each program and scan again, to measure the improvement?

    6) I've heard that you can overstress and damage your vocal cords through speech recognition (RSI of the voice). Have you researched that?

    7) How often did both packages make the same mistakes? If you ran it through both packages in real time minimal mode, then DNS at several different speeds, could you run an algorithm on the different results to increase accuracy?

    Nick Burger
  • JarredWalton - Friday, April 21, 2006 - link

    1 -- Both were trained in the same manner, basically me speaking the text, but doing my best to enunciate words a little better than I might do in the real world. Besides, good fiction is a useful skill to have, particularly if you're speaking with business people.

    2 -- That's entirely possible. One of the odd things is that the accuracy shown in my dictation benchmarks doesn't seem to correspond with my own personal experience of trying to use the software. It may simply be the way that I speak when trying to write articles, but I find that Microsoft is far worse in normal use. That's not a very scientific method, but I can't emphasize enough how much more difficult I find Microsoft's speech interface is to use.

    3 -- Dragon runs as a normal priority process, and when you're dictating with the accuracy set to "medium" it uses 20 to 50% of the processor time (on a single core Athlon 64 2.4 GHz). The memory footprint is pretty large, at about 150 to 200 MB. As far as I can tell, it will not use more than 200 MB -- during testing, I watched RAM usage on the "maximum accuracy" configuration, because I was curious to see if the switch from 1 GB on my old system to 2 GB on my new system would help. It did not. (the total size of my database/voice files is currently just over 300 MB.)

    I also noticed on my old system that Dragon requires a fair amount of hard disk access. I was copying several gigabytes of data from one computer to another computer (over gigabit ethernet) and Dragon's responsiveness dropped way off. It was still accurate, but rather than speaking and seeing the text a second or so later, there was a four or five second pause for most sentences.

    4 -- I included a link to a zip file in the article for anyone interested in looking at specific errors. The text files were compared using WinDiff, and I manually counted errors. (I was somewhat lenient, in that I allowed "speech-recognition" to match "speech recognition" -- stuff like that.)

    5 -- Dragon has definitely been "trained" on the document. Microsoft seems to do its own thing in terms of training, so all I could do is make sure that all of the words used were known by the speech engine. When you make an error using Microsoft's tool, as far as I know you have to correct with the keyboard. You can't just tell it to select the misinterpreted words and provide the correct interpretation. Perhaps it's possible to switch to command mode, tell the application to select something, then switch to dictation mode and give the correct spelling... at that point, you're far better off using the mouse and keyboard, and if you can't use those then you're much better off using Dragon's interface.

    6 -- That's entirely possible, and laryngitis certainly doesn't help speech recognition at all. You definitely don't want to get in the habit of speaking really loudly, so it's best to train the software in a somewhat subdued voice (in my opinion). I would say the most important thing is to do everything in moderation; sitting at a computer dictating for 12 hours a day is going to be just as harmful in the long run as sitting at a computer typing 12 hours a day.
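
    The error counting described in point 4 (diff the transcript against the source text, then tally the mismatches while forgiving hyphenation differences) could be roughly automated. What follows is only a minimal Python sketch under stated assumptions: it uses the standard library's difflib rather than WinDiff, the helper names `normalize`, `count_errors`, and `accuracy` are hypothetical, and the tally is an approximation rather than the exact manual count used for the article.

```python
import difflib
import re


def normalize(text):
    """Lowercase, treat hyphens as spaces, drop punctuation, split into words.

    Assumption: "speech-recognition" and "speech recognition" should match,
    mirroring the lenient manual count described above.
    """
    text = text.lower().replace("-", " ")
    return re.findall(r"[a-z0-9']+", text)


def count_errors(reference, transcript):
    """Return (approximate word errors, number of reference words).

    difflib does not guarantee a minimal edit script, so this is an
    estimate, not a strict word error rate.
    """
    ref = normalize(reference)
    hyp = normalize(transcript)
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count the longer side of each mismatched region.
            errors += max(i2 - i1, j2 - j1)
    return errors, len(ref)


def accuracy(reference, transcript):
    """Fraction of reference words that were not flagged as errors."""
    errors, total = count_errors(reference, transcript)
    return 1.0 - errors / total if total else 0.0


if __name__ == "__main__":
    source = "good diction is a useful skill"
    dictated = "good fiction is a useful skill"
    print(count_errors(source, dictated))  # -> (1, 6)
```

    A script like this only gives a quick first pass; a manual read-through is still needed to judge borderline cases such as homophones or split words.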
  • bobsmith1492 - Saturday, April 22, 2006 - link

    "Besides, good fiction is a good skill to have when... "

    :P Kind of like Isaac Asimov?
  • JarredWalton - Saturday, April 22, 2006 - link

    See what I get for not proofing carefully? LOL - that's the type of error I get most of the time. "A" for "the" is another common one.
  • Gioron - Friday, April 21, 2006 - link

    My brother swears by DNS, but using it myself and watching him use it I just can't stand going that slow. I've gotten to the point where I can type much faster than the speech recognition can handle it, and stopping to correct it just slows things down to a painful level. Of course, I'd probably have to learn to live with it if my wrists started bothering me, but until then...

    And then there's this bash.org quote:
    http://www.bash.org/?34776
    <www666> this is so cool I'm typing with Dragon NaturallySpeaking in mIrc
    <www666> no more typing
    <LameLLama> www: try "thlash exit"
    *** www666 has quit IRC (Leaving)
    *** www666 ([email protected]int.ca) has joined #visualbasic
    <www666> Hugh Masters
    <www666> you basterdes
  • hans007 - Friday, April 21, 2006 - link

    i used speech recognition with office xp when it came out. that was awful.

    my acura navigation has speech recognition, which is also not, well, that useful; it's still easier to use buttons.


    i honestly think it will never be better than just buttons.
  • Googer - Saturday, April 22, 2006 - link

    BMW 7 series Speech recognition is about 50-75% accurate (my guess) and some users have more luck with it than others.
  • Googer - Friday, April 21, 2006 - link

    I think you should re-benchmark these on a system that is not overclocked. Overclocking may have contributed to erroneous test results. It is possible that some of the benchmarks could have been better on a normal system. Also, I am surprised this was not tested on an Intel system. Perhaps one of the programs may benefit from the NetBurst architecture, with or without dual core.


    Also, I would love to download the Dictation and Normal Voice wav files, so I can understand the difference between them. Thanks for the article, it came at the perfect time; someone who is handicapped was asking me about this last night.
  • JarredWalton - Friday, April 21, 2006 - link

    I'll see about putting up some MP3s of the wave files -- of course, that will open the door for all of you to make fun of how I speak. LOL

    In case this wasn't entirely clear in the article, this was all done on my system that I use every day for work. It's overclocked, and it's been that way for six months. I run stress tests (Folding at Home -- on both cores) all the time. I would be very surprised if the overclock has done anything to affect accuracy, especially considering that I did run some tests on a couple other systems that were not overclocked, and basically removed them from this article because they would have simply taken more time to put in the article, and they didn't give me any new information.

    It's pretty obvious that neither of these algorithms benefit from multiple processing cores -- HyperThreading, dual core, SMP, whatever. I also wasn't sure how much interest there would be from people in this topic, but if a lot of people want to know how this runs on Intel systems I could go back and look at one. One thing worth noting is that SysMark 2004 does include Dragon NaturallySpeaking version 6.5 as one of the tests. Of course, the results are buried in the composite scores.
  • JarredWalton - Friday, April 21, 2006 - link

    MP3 links available:

    http://www.anandtech.com/multimedia/showdoc.aspx?i...

    Note that DNS only uses WAV files (AFAICT), but uploading 45MB WAV files seems pointless. Convert them to WAVs if you want to try them with Dragon.
  • Googer - Saturday, April 22, 2006 - link

    Excellent job on the dictation/wav files, you are a very good reader and have a nice, clear, and concise voice. ;ThumbsUP)
  • stelleg151 - Friday, April 21, 2006 - link

    Cool article. I hope that voice recognition continues to improve, for I think it could be incredibly useful for areas like HTPC, or as you said, messaging while doing other things (gaming).
  • Zerhyn - Friday, April 21, 2006 - link

    Have you ever tried out speech recognition and been underwhelmed? To you yearn to play the role of Scotty and call out..

    ?
  • PrinceGaz - Friday, April 21, 2006 - link

    Yes, that was the first thing I noticed before I even started reading the article. Maybe they used speech-recognition software to enter that.

    I think they should have an editor (or at least let another contributor read what others have written) who has to approve an article before it goes live as the current number of tyops is unforgiveable ;)
  • JarredWalton - Friday, April 21, 2006 - link

    I'm doing my best to catch typos before anything goes live, but after being up all night trying to finish off this article, I went to post and realized I didn't have a title or intro. So, I put one in using Dragon, but my diction goes to pot when I'm tired, as does my eyesight and proofing ability. One typo in a 44 word intro (I didn't proof/edit it at all) isn't too bad for the software. Bad for me? Maybe, but mistakes do happen. :)
  • johnsonx - Friday, April 21, 2006 - link

    One nice thing about Dragon, despite the high CPU utilization shown in the article, is that it will run quite happily with very lowly systems. I have a customer who uses it all day long on PentiumIII-850's with only 512Mb RAM (the max for those particular systems). The heaviest user there recently upgraded to a low-end Sempron64 with a gig of RAM, and he says the overall system is far more responsive (of course), but Dragon's operation isn't radically better; it worked great on the PIII, and works great now.
  • JarredWalton - Friday, April 21, 2006 - link

    That's definitely true -- if you look at how accuracy scales with CPU usage, doubling and even tripling the processor time comes with only incremental increases in accuracy. I do have to say that I noticed it being a little sluggish on my single core system when I was multitasking, but obviously I push my computers a little harder than a lot of people. Depending on what you're willing to live with in terms of speed, I'm sure both Dragon and Microsoft speech recognition can work on a Pentium III level system.
  • LanceM - Friday, April 21, 2006 - link

    So is that selection typical Asimov? If so, it has convinced me to never bother reading any of his works.

    His ideas/plots/etc. may be interesting, but I don't think I could handle phrases like, "as if she were some dried-up, old-maid teacher." Give me Joseph Conrad or William Faulkner.
  • Dfere - Monday, April 24, 2006 - link

    Asimov is classic Sci-Fi pulp, which usually had a gritty detective-novel appeal. His works are in large part murder mystery type novels. You have to understand the nature of the literature, the history and the author. I don't think a critique is deserved until then.

    Most Sci Fi writers of any ability first master imaginative concepts and apply them, even Drake and Stirling.

    I give Kudos to the staff for including literary comments, the poster who said this should not be a book of the month club lives a very one dimensional life.
  • Shoal07 - Friday, April 21, 2006 - link

    What makes Asimov special is that many of his ideas in science fiction are coming true today or are at least on the horizon. Asimov shaped the way many of us picture the future.
  • goinginstyle - Friday, April 21, 2006 - link

    Why does the Anandtech staff revert to literary quotes in their reviews now? This is a computer website, not a book club.
  • JarredWalton - Friday, April 21, 2006 - link

    I read Asimov's Foundation series as a teenager, and I loved it. He gave me lots of fanciful dreams about where technology might go in the future, and even though some of the writing styles have changed over the years, I still find a lot of these old sci-fi books to be entertaining. You should try reading War of the Worlds if you think that quote was bad. LOL

    Sorry if some of you didn't like the quote. Everyone has their own dislikes and likes, but in the end it's just an introduction. I hope to one day be able to yell at my computer and have it properly understand what I say, as well as the context (i.e., yelling means something is going wrong, and maybe it can help me out). Will we ever get there? Probably some day, but whether it happens in our lifetimes or not is anyone's guess.
  • NegativeEntropy - Saturday, April 22, 2006 - link

    I like the use of quotes -- though it does remind me a bit of being in English/writing class ("Always do something in the introduction to get your audience's attention...").

    On the subject of "classic" Sci-fi writers, I also still enjoy old school Heinlein. Though his characters can get a bit repetitive across his pile of works, many of the science ideas are still valid (and I share much of his apparent personal philosophy).

    On the actual article -- thanks for doing it. I have been curious where this technology was at in terms of every day usage and hardware requirements.

    Regarding CPU usage, it's possible DNS attempts to use whatever resources are available based on preferences. i.e. on minimum, it attempts to impact the system minimally, regardless of the CPU resources available; say 25% on min, 50% on med and 95% on max with the percentage staying relatively consistent from a P3 1GHz to an A64 2.6GHz. This would explain its reported good scaling from system to system. If you want to test it, underclock your A64 system to half its frequency and compare utilization at the medium setting.
  • kristof007 - Friday, April 21, 2006 - link

    Here at Anandtech you can always count on finding something else. Great article! I tried out speech recognition a few years back and I got frustrated with it over one thing or another, so I just dropped it and went back to typing. I've been typing for about 8 years now. I never learned the "proper" way to type where every finger has a spot. Anyway, I hope Vista will make speech recognition WAAY better so that it could be used around the OS AND for dictation.

    Thanks for the article!
